Conversation

helloaidank (Contributor)

Summary

Added comprehensive analysis notebooks for the ASF mission, focusing on affordability and efficiency in heating solutions.

Notebooks Added

  • CB_hp_efficiency.ipynb - Crunchbase analysis of heat pump efficiency companies
  • gtr_lch_efficiency_search.ipynb - GTR analysis of low-carbon heating efficiency research
  • CB_LC_green_finance_search.ipynb - Crunchbase green finance analysis
  • gtr_green_finance_search.ipynb - GTR green finance research analysis

Key Features

  • Time series analysis with gap imputation for consistent visualizations
  • Google Sheets integration for collaborative data sharing
  • Organized directory structure for data outputs
  • Enhanced .gitignore to exclude generated files
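
The gap imputation mentioned above is defined in the notebooks themselves; as a rough sketch of the idea (the yearly counts and series name here are hypothetical, not taken from the notebooks), missing periods can be reindexed in and filled with zeros so time series charts stay continuous:

```python
import pandas as pd

# Hypothetical yearly project counts with a gap: no records for 2021
counts = pd.Series({2019: 4, 2020: 7, 2022: 9}, name="n_projects")

# Reindex over the full year range so the gap year appears explicitly,
# then fill it with 0 so line charts do not jump from 2020 to 2022
full_years = range(counts.index.min(), counts.index.max() + 1)
imputed = counts.reindex(full_years, fill_value=0)

print(imputed.to_dict())
```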

Technical Improvements

  • Standardized notebook structure across CB and GTR analyses
  • Comprehensive error handling and data validation
  • Chart data export for Google Sheets visualization
  • Configuration files for reproducible analysis

Review Instructions

  1. Notebooks are ready to run with proper credentials (Crunchbase S3, OpenAI API, Google Sheets)
  2. Generated outputs (CSVs, PNGs, HTML) are excluded from git tracking
  3. Each notebook includes comprehensive documentation and usage instructions

Next Steps

  • Review notebook structure and approach
  • Test with real data sources
  • Integrate with main pipeline if approved

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
@beingkk requested review from Copilot and beingkk on July 30, 2025 at 13:13
Copilot AI left a comment

Pull Request Overview

This PR adds four Jupyter notebooks for the ASF mission's work on affordability and efficiency in heating solutions, covering heat pump efficiency research and green finance analysis.

Key changes:

  • Implements dual-stage filtering approach for green finance research (broad green finance → low-carbon heating specific)
  • Adds LLM-validated analysis for heat pump efficiency research projects
  • Standardizes notebook structure across both Crunchbase (CB) and Gateway to Research (GTR) data sources
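
The dual-stage filtering described above (a broad green-finance pass, then a low-carbon-heating pass over the survivors) could be sketched as follows; the abstracts and keywords here are illustrative placeholders, not the actual terms from the YAML config files:

```python
import pandas as pd

# Hypothetical project records; the search terms below are placeholders
projects = pd.DataFrame({
    "id": ["p1", "p2", "p3"],
    "abstractText": [
        "Green finance instruments for heat pump retrofits",
        "Green finance for offshore wind portfolios",
        "Protein folding simulation methods",
    ],
})

# Stage 1: broad green-finance filter over all abstracts
stage1 = projects[projects["abstractText"].str.contains("green finance", case=False)]

# Stage 2: narrow the stage-1 survivors to low-carbon heating
stage2 = stage1[stage1["abstractText"].str.contains("heat pump|heating", case=False)]

print(stage2["id"].tolist())
```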

Reviewed Changes

Copilot reviewed 8 out of 11 changed files in this pull request and generated 5 comments.

File | Description
gtr_green_finance_search.ipynb | Two-stage GTR analysis notebook with progressive filtering and comprehensive documentation
gtr_lch_efficiency_search.ipynb | GTR heat pump efficiency analysis with simplified directory structure and chart generation
config files (*.yaml) | Search configuration files defining keywords and scope statements for both CB and GTR analyses

" except Exception as e:\n",
" print(f\"❌ Error uploading {config_name} to Google Sheets: {e}\")\n",
"\n",
"print(\"\\\\n🎯 Main data Google Sheets upload complete!\")"
Copilot AI Jul 30, 2025

The string contains an escaped backslash that will be printed literally as '\n' instead of a newline. Remove one backslash to fix: print("\n🎯 Main data Google Sheets upload complete!")

Suggested change
"print(\"\\\\n🎯 Main data Google Sheets upload complete!\")"
"print(\"\\n🎯 Main data Google Sheets upload complete!\")"


" import traceback\n",
" traceback.print_exc()\n",
"\n",
"print(\"\\\\n🎯 Chart data Google Sheets upload complete!\")"
Copilot AI Jul 30, 2025

The string contains an escaped backslash that will be printed literally as '\n' instead of a newline. Remove one backslash to fix: print("\n🎯 Chart data Google Sheets upload complete!")

Suggested change
"print(\"\\\\n🎯 Chart data Google Sheets upload complete!\")"
"print(\"\\n🎯 Chart data Google Sheets upload complete!\")"


Comment on lines 42 to 43
"from pathlib import Path\n",
"#from discovery_mission_radar import VECTOR_DB_DIR"
Copilot AI Jul 30, 2025

Remove commented-out import that is not needed. The VECTOR_DB_DIR is created locally within the notebook.

Suggested change
"from pathlib import Path\n",
"#from discovery_mission_radar import VECTOR_DB_DIR"
"from pathlib import Path\n"


" session_name=\"mission_studio\",\n",
" output_fields=[{\"name\":\"is_relevant\",\"type\":\"str\",\"description\":\"yes or no\"}]\n",
" )\n",
" await proc1.run(dict(zip(relevant_gf['id'], relevant_gf['abstractText'])), batch_size=15, sleep_time=0.5)\n",
Copilot AI Jul 30, 2025

[nitpick] Consider extracting the batch_size and sleep_time as configurable constants at the top of the notebook to make it easier to adjust LLM processing parameters without searching through the code.

Suggested change
" await proc1.run(dict(zip(relevant_gf['id'], relevant_gf['abstractText'])), batch_size=15, sleep_time=0.5)\n",
" await proc1.run(dict(zip(relevant_gf['id'], relevant_gf['abstractText'])), batch_size=BATCH_SIZE, sleep_time=SLEEP_TIME)\n",


" session_name='mission_studio',\n",
" output_fields=[{'name':'is_relevant','type':'str','description':'yes or no'}]\n",
" )\n",
" await processor.run(dict(zip(relevant['id'], relevant['abstractText'])), batch_size=15, sleep_time=0.5)\n",
Copilot AI Jul 30, 2025

[nitpick] Consider extracting the batch_size and sleep_time as configurable constants at the top of the notebook to make it easier to adjust LLM processing parameters without searching through the code.

Suggested change
" await processor.run(dict(zip(relevant['id'], relevant['abstractText'])), batch_size=15, sleep_time=0.5)\n",
" await processor.run(dict(zip(relevant['id'], relevant['abstractText'])), batch_size=BATCH_SIZE, sleep_time=SLEEP_TIME)\n",


helloaidank and others added 6 commits July 30, 2025 17:36
Add comprehensive GtR analysis notebook for low-carbon heating (LCH) optimisation research projects, including data extraction, filtering, and relevance-checking workflows.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>