
PaperWithoutCode

An agentic tool to generate code prototypes from research papers, powering https://paperwithoutcode.com/. No LlamaIndex, no LangChain, just getting back to the root of the idea.


Overview

PaperWithoutCode is the Python project behind https://paperwithoutcode.com/. It takes an arXiv paper PDF link as input, downloads the PDF, converts it to markdown, generates a summary of the paper, creates a Python script that demonstrates the paper's core idea, executes the script in a sandbox, and then reviews and improves it. An agentic workflow dynamically manages all of these tasks.

The generated code is executed in a sandboxed conda environment, and the results are analyzed to refine the code if necessary. This process is repeated up to a specified number of iterations to produce a working prototype that best illustrates the paper's idea.

Features

  • Download PDFs from arXiv links
  • Convert PDFs to markdown using the markitdown package
  • Generate comprehensive paper summaries using a three-pass approach
  • Select an appropriate template based on the paper's subfield
  • Generate code using Claude 3.7 Sonnet
  • Execute the generated code in a sandbox conda environment
  • Analyze execution results and refine the code iteratively
  • Support for various computer science subfields with specialized templates

Installation

Automatic Setup (Recommended)

On Linux/macOS:

git clone https://github.com/phunterlau/paper_without_code.git
cd paper_without_code
chmod +x setup.sh  # Make the setup script executable
./setup.sh

On Windows:

git clone https://github.com/phunterlau/paper_without_code.git
cd paper_without_code
setup.bat

The setup script will:

  1. Create the conda environment with all required dependencies
  2. Prompt you for API keys if they're not already set
  3. Create the necessary output directories
  4. Provide instructions for running the project

Manual Setup

  1. Clone the repository:

    git clone https://github.com/phunterlau/paper_without_code.git
    cd paper_without_code
    
  2. Create and activate the conda environment:

    conda env create -f environment.yml
    conda activate paperwocode
    

    This creates a conda environment with all required dependencies at compatible versions.

  3. Set the required API keys as environment variables:

    export ANTHROPIC_API_KEY=your_anthropic_api_key
    export OPENAI_API_KEY=your_openai_api_key
    

    On Windows, use:

    set ANTHROPIC_API_KEY=your_anthropic_api_key
    set OPENAI_API_KEY=your_openai_api_key
    
  4. Create the output directory structure:

    mkdir -p output/workflows
    

Usage

Run the script with an arXiv URL:

python main.py https://arxiv.org/abs/2411.16905

Or use the default URL (https://arxiv.org/abs/2411.16905):

python main.py

Command-line Arguments

  • arxiv_url: URL to the arXiv paper (default: https://arxiv.org/abs/2411.16905)
  • --force: Force code generation even for non-CS papers
  • --output, -o: Output directory for generated files (default: "output")
  • --iterations, -i: Maximum number of iterations for code refinement (default: 3)
  • --no-cache: Skip using cached results and rerun all steps
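
For example, the following command forces code generation, writes output to a custom directory, allows up to five refinement iterations, and ignores any cached results:

python main.py https://arxiv.org/abs/2411.16905 --force --output results --iterations 5 --no-cache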

Project Structure

.
├── main.py                         # Main entry point
├── setup.py                        # Package setup script
├── setup.sh                        # Setup script for Linux/macOS
├── setup.bat                       # Setup script for Windows
├── requirements.txt                # Project dependencies
├── environment.yml                 # Conda environment specification
├── paperwocode/                    # Main package directory
│   ├── __init__.py                 # Package initialization
│   ├── pdf_downloader.py           # Module for downloading PDFs from arXiv
│   ├── pdf_to_markdown.py          # Module for converting PDFs to markdown
│   ├── paper_summarizer.py         # Module for summarizing papers
│   ├── code_generator.py           # Module for generating code
│   ├── code_executor.py            # Module for executing code in a sandbox
│   └── agent.py                    # Module for orchestrating the workflow
├── templates/                      # Templates for different types of papers
│   ├── gpt_structured_output.py    # Template for LLM-related papers
│   ├── pytorch_model.py            # Template for model-related papers
│   ├── data_analysis.py            # Template for data science papers
│   ├── algorithm.py                # Template for algorithm-related papers
│   └── general_cs.py               # Template for general CS papers
└── output/                         # Base output directory
    └── workflows/                  # Workflows directory
        └── {arxiv_id}/             # Workflow directory for each paper
            ├── {arxiv_id}.pdf      # Downloaded PDF
            ├── {arxiv_id}.md       # Converted markdown
            ├── {arxiv_id}-summary.md # Paper summary
            ├── {arxiv_id}-markmap.md # Mind map
            ├── generated_code.py   # Final generated code
            ├── generated_code_iter1.py # First iteration of code
            ├── generated_code_iter2.py # Second iteration of code
            ├── generated_code_iter3.py # Third iteration of code
            ├── execution_results_iter1.txt # First execution results
            ├── execution_results_iter2.txt # Second execution results
            ├── execution_results_iter3.txt # Third execution results
            ├── code_reflection_iter1.md # First code reflection
            ├── code_reflection_iter2.md # Second code reflection
            ├── code_reflection_iter3.md # Third code reflection
            └── logs/                # Logs directory
                ├── code_generation_*.log # Timestamped log files
                └── code_generation_complete.log # Final log file

Workflow

  1. Download PDF: The PDF is downloaded from the provided arXiv URL.
  2. Convert to Markdown: The PDF is converted to markdown using the markitdown package.
  3. Summarize Paper: The paper is summarized using a three-pass approach.
  4. Check if CS Paper: The system determines if the paper is a computer science paper and identifies its subfield.
  5. Generate Code: Claude Sonnet is used to generate a Python script based on the paper's core idea.
  6. Execute Code: The generated code is executed in a sandbox conda environment.
  7. Analyze Results: The execution results are analyzed to identify any issues.
  8. Refine Code: If issues are found, the code is refined and the process is repeated.
  9. Final Output: The final code is saved to the output directory.
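
Steps 5 through 8 form the refinement loop. As a minimal sketch of step 6, sandbox execution reduces to running the generated script through conda run. The environment name below matches the setup instructions; the function name and timeout are illustrative assumptions, and the real logic lives in paperwocode/code_executor.py:

    import subprocess

    def execute_in_sandbox(script_path: str, env_name: str = "paperwocode"):
        """Run a generated script inside the sandbox conda environment and
        capture its output for the analysis step (sketch, not the project's
        actual implementation)."""
        result = subprocess.run(
            ["conda", "run", "-n", env_name, "python", script_path],
            capture_output=True,
            text=True,
            timeout=600,  # assumed ten-minute cap on a single run
        )
        return result.returncode, result.stdout, result.stderr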

Caching System

The project includes a caching system that saves time by reusing previously generated results:

  1. Automatic Caching: Each step of the workflow checks if the output already exists in the workflow directory.
  2. Cache Usage: If a cached result is found, it's used instead of rerunning the step, which can save significant time, especially for the summarization and code generation steps.
  3. Cache Invalidation: Use the --no-cache flag to skip using cached results and rerun all steps.
  4. Workflow Directory: All cached files are stored in a dedicated directory for each paper (output/workflows/{arxiv_id}/), making it easy to manage and share results.
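
At its core, each cache check is just a test for the step's output file in the workflow directory. Below is a minimal sketch of that pattern, using the file-naming convention from the project structure above; the helper name is illustrative:

    import os

    def load_cached(workflow_dir: str, filename: str, no_cache: bool = False):
        """Return a previously generated artifact if it exists, else None.

        Illustrative helper: each workflow step checks for its own output
        file (e.g. "{arxiv_id}-summary.md") before doing any work.
        """
        path = os.path.join(workflow_dir, filename)
        if not no_cache and os.path.exists(path):
            with open(path, "r", encoding="utf-8") as f:
                return f.read()
        return None  # cache miss: the step runs and then writes `path`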

This caching system is particularly useful when:

  • You want to experiment with different code generation parameters without re-processing the paper
  • You're working with the same paper multiple times
  • You want to share the results with others

Cost Estimation System

The project includes a cost estimation system that calculates and displays the estimated API costs for each step:

  1. Token Counting: The system counts input and output tokens for each API call using the tiktoken library.
  2. Cost Calculation: Based on the token counts, the system calculates the estimated cost using Claude's pricing model.
  3. Transparent Reporting: The estimated cost is displayed after each API call, showing:
    • Total cost in USD
    • Input token count and cost
    • Output token count and cost
  4. Logging: All cost estimates are logged for future reference and budget planning.
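
Concretely, the token counting and cost arithmetic reduce to a few lines. A minimal sketch, assuming the cl100k_base encoding as an approximation of Claude's tokenizer and placeholder per-million-token prices (check Anthropic's current pricing before relying on the numbers):

    import tiktoken

    # Placeholder prices in USD per million tokens; substitute current Claude pricing.
    INPUT_USD_PER_MTOK = 3.00
    OUTPUT_USD_PER_MTOK = 15.00

    def estimate_cost(prompt: str, completion: str) -> dict:
        # tiktoken ships OpenAI encodings, so this only approximates Claude's counts.
        enc = tiktoken.get_encoding("cl100k_base")
        input_tokens = len(enc.encode(prompt))
        output_tokens = len(enc.encode(completion))
        input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MTOK
        output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MTOK
        return {
            "input_tokens": input_tokens,
            "input_cost": input_cost,
            "output_tokens": output_tokens,
            "output_cost": output_cost,
            "total_cost": input_cost + output_cost,
        }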

This cost estimation system helps you:

  • Monitor API usage and costs in real-time
  • Plan and budget for larger projects
  • Optimize prompts to reduce costs
  • Track expenses across different papers

Code Reflection System

The project includes an intelligent reflection system that evaluates how well the generated code illustrates the paper's core idea:

  1. Automated Evaluation: After each code generation and execution step, Claude Sonnet evaluates the code against the paper's core idea and major contributions.
  2. Comprehensive Assessment: The reflection considers:
    • How well the code illustrates the paper's core idea
    • Whether the implementation is too simple or too complex
    • Which aspects of the paper's contributions are well-represented
    • Which aspects are missing or could be improved
    • Specific recommendations for the next iteration
  3. Feedback Loop: The reflection is used as input for the next iteration of code generation, creating a continuous improvement cycle.
  4. Documentation: All reflections are saved as markdown files in the workflow directory for future reference.
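
A minimal sketch of such a reflection call, using the official anthropic Python client; the model id and prompt wording below are illustrative assumptions, not the project's actual values:

    import anthropic

    def reflect_on_code(core_idea: str, code: str, results: str) -> str:
        """Ask Claude to evaluate generated code against the paper's core idea."""
        client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
        prompt = (
            f"Paper's core idea:\n{core_idea}\n\n"
            f"Generated code:\n{code}\n\n"
            f"Execution results:\n{results}\n\n"
            "Evaluate how well the code illustrates the core idea, note which "
            "contributions are missing or oversimplified, and recommend changes "
            "for the next iteration."
        )
        message = client.messages.create(
            model="claude-3-7-sonnet-latest",  # illustrative model id
            max_tokens=1024,
            messages=[{"role": "user", "content": prompt}],
        )
        return message.content[0].text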

This reflection system ensures that the generated code not only runs correctly but also effectively demonstrates the paper's key concepts and contributions.

Logging System

The project includes a comprehensive logging system that records all steps of the code generation process:

  1. Detailed Logs: All code generation steps, API calls, execution results, analysis, and reflections are logged in detail.
  2. Log Directory: Logs are stored in a dedicated directory for each paper (output/workflows/{arxiv_id}/logs/).
  3. Timestamped Logs: Each log file is timestamped to track different runs.
  4. Complete Record: The entire process is recorded, including:
    • Claude API prompts and responses
    • Code execution outputs and errors
    • Analysis of execution results
    • Code reflections and evaluations
    • Final code generation results
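
A minimal sketch of the timestamped-log-file convention, using the standard logging module; the timestamp format is an assumption, since the tree above only shows code_generation_*.log:

    import logging
    import os
    from datetime import datetime

    def make_workflow_logger(workflow_dir: str) -> logging.Logger:
        """Create a logger writing to logs/code_generation_<timestamp>.log."""
        log_dir = os.path.join(workflow_dir, "logs")
        os.makedirs(log_dir, exist_ok=True)
        stamp = datetime.now().strftime("%Y%m%d-%H%M%S")  # assumed format
        logger = logging.getLogger(f"paperwocode.{stamp}")
        handler = logging.FileHandler(
            os.path.join(log_dir, f"code_generation_{stamp}.log")
        )
        handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
        logger.setLevel(logging.DEBUG)
        return logger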

This logging system is particularly useful for:

  • Debugging code generation issues
  • Understanding why certain code was generated
  • Tracking the evolution of code across iterations
  • Auditing the code generation process

Templates

The system includes several templates for different types of computer science papers:

  • gpt_structured_output.py: For LLM-related papers, using GPT-4o-mini for structured output.
  • pytorch_model.py: For model-related papers, using PyTorch.
  • data_analysis.py: For data science papers, using pandas, scikit-learn, and matplotlib.
  • algorithm.py: For algorithm-related papers, with benchmarking capabilities.
  • general_cs.py: For general computer science papers.
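
Template selection can be expressed as a simple subfield-to-file mapping. A minimal sketch using the template files listed above; the subfield keys and the fallback to general_cs.py are assumptions about how the classifier labels papers:

    # Map a detected subfield to its template file (names match templates/ above).
    TEMPLATE_BY_SUBFIELD = {
        "llm": "gpt_structured_output.py",        # assumed subfield label
        "machine_learning": "pytorch_model.py",   # assumed subfield label
        "data_science": "data_analysis.py",       # assumed subfield label
        "algorithms": "algorithm.py",             # assumed subfield label
    }

    def select_template(subfield: str) -> str:
        # Assumed behavior: unknown subfields fall back to the general template.
        return TEMPLATE_BY_SUBFIELD.get(subfield, "general_cs.py")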

Requirements

  • Python 3.10+
  • Conda (for creating the sandbox environment)
  • Anthropic API key (for Claude Sonnet)
  • OpenAI API key (for GPT-4o-mini)

License

This project is licensed under the MIT License - see the LICENSE file for details.
