A comprehensive suite of Retrieval Augmented Generation (RAG) chatbots for electrical codes and standards, supporting both OpenAI and Google Gemini APIs.
```
RAG/
├── json-chunked_rag/                   # JSON-based RAG system
│   ├── data/                           # JSON data files
│   │   ├── pec_text_chunks_prod.json   # Text chunks (4.8MB)
│   │   ├── PEC_tables.json             # Table data (0.3MB)
│   │   └── PEC_tables_row_chunks.json  # Table rows (1.0MB)
│   ├── src/                            # Source code
│   │   ├── json_rag_chatbot.py         # Streamlit web app
│   │   ├── simple_json_rag.py          # Command-line interface
│   │   ├── data_inspector.py           # Data analysis utility
│   │   └── ai_provider.py              # AI provider abstraction
│   ├── config.py                       # Configuration management
│   ├── requirements.txt                # Dependencies
│   ├── setup.py                        # Setup script
│   └── README.md                       # JSON RAG documentation
├── pdf-chunked_rag/                    # PDF-based RAG system
│   ├── data/                           # PDF and related files
│   │   ├── PEC_Content_1-4_combined.pdf  # Main PDF (56MB)
│   │   ├── pec_text_chunks_prod.json     # Pre-processed chunks
│   │   └── ...                           # (other data files)
│   ├── src/                            # Source code
│   │   ├── chatbot_rag.py              # Streamlit web app
│   │   ├── simple_rag.py               # Command-line interface
│   │   └── ai_provider.py              # AI provider abstraction
│   ├── config.py                       # Configuration management
│   ├── requirements.txt                # Dependencies
│   ├── setup.py                        # Setup script
│   └── README.md                       # PDF RAG documentation
├── env_template.txt                    # Environment setup template
└── README.md                           # This file
```
- OpenAI GPT models: gpt-3.5-turbo, gpt-4, etc.
- Google Gemini: gemini-1.5-flash and other models
- Easy switching between providers via configuration
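Provider switching can be sketched as a small factory keyed on the `AI_PROVIDER` environment variable. This is an illustrative sketch only; the class names here are hypothetical, and the actual `ai_provider.py` may be structured differently:

```python
import os

class OpenAIProvider:
    """Wraps calls to the OpenAI chat API (client code elided)."""
    def __init__(self, model: str):
        self.model = model

    def name(self) -> str:
        return f"openai:{self.model}"

class GeminiProvider:
    """Wraps calls to the Google Gemini API (client code elided)."""
    def __init__(self, model: str):
        self.model = model

    def name(self) -> str:
        return f"gemini:{self.model}"

def get_provider():
    """Pick a provider based on the AI_PROVIDER environment variable."""
    provider = os.getenv("AI_PROVIDER", "openai").lower()
    if provider == "openai":
        return OpenAIProvider(os.getenv("OPENAI_MODEL", "gpt-3.5-turbo"))
    if provider == "gemini":
        return GeminiProvider(os.getenv("GEMINI_MODEL", "gemini-1.5-flash"))
    raise ValueError(f"Unknown AI_PROVIDER: {provider}")
```

Because the rest of the code only talks to the returned object, switching providers never requires touching application logic, only the `.env` file.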
- JSON RAG (`json-chunked_rag/`): optimized for pre-processed JSON data
- PDF RAG (`pdf-chunked_rag/`): direct PDF processing with text extraction
- Web interface: Streamlit applications
- Command line: simple CLI for testing and automation
- Data tools: analysis and inspection utilities
For JSON-based RAG (Recommended):

```shell
cd json-chunked_rag
```

For PDF-based RAG:

```shell
cd pdf-chunked_rag
```
Create a `.env` file in your chosen directory:

```shell
# Copy the template
cp ../env_template.txt .env

# Edit with your preferred editor
notepad .env  # or nano, vim, etc.
```

Example `.env` configuration:

```
AI_PROVIDER=openai
OPENAI_API_KEY=your_openai_api_key_here
OPENAI_MODEL=gpt-3.5-turbo
```
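Reading and validating these variables can be sketched as follows (an illustrative sketch, not the project's actual `config.py`; note the error message mirrors the one shown under Troubleshooting):

```python
import os

def load_config() -> dict:
    """Read provider settings from the environment; fail fast if the key is missing."""
    provider = os.getenv("AI_PROVIDER", "openai").lower()
    key_vars = {"openai": "OPENAI_API_KEY", "gemini": "GEMINI_API_KEY"}
    if provider not in key_vars:
        raise ValueError(f"Unknown AI_PROVIDER: {provider}")
    api_key = os.getenv(key_vars[provider])
    if not api_key:
        raise RuntimeError(f"{key_vars[provider]} must be set when using {provider}")
    return {"provider": provider, "api_key": api_key}
```

Validating at startup means a missing key surfaces as one clear error rather than a failed API call mid-conversation.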
Install the dependencies:

```shell
python setup.py
# or manually:
pip install -r requirements.txt
```
Web Interface:

```shell
streamlit run src/json_rag_chatbot.py  # or src/chatbot_rag.py for PDF
```

Command Line:

```shell
python src/simple_json_rag.py  # or src/simple_rag.py for PDF
```
Set `AI_PROVIDER` in your `.env` file:

| Provider | Model Options | API Key Variable |
|---|---|---|
| `openai` | `gpt-3.5-turbo`, `gpt-4`, `gpt-4o` | `OPENAI_API_KEY` |
| `gemini` | `gemini-1.5-flash`, `gemini-pro` | `GEMINI_API_KEY` |
```
CHUNK_SIZE=1000                   # Text chunk size
CHUNK_OVERLAP=200                 # Overlap between chunks
MAX_RETRIEVAL_CHUNKS=5            # Number of chunks to retrieve
EMBEDDING_MODEL=all-MiniLM-L6-v2  # Sentence transformer model
```
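The chunk-size and overlap settings behave like a sliding window: each chunk starts `CHUNK_SIZE - CHUNK_OVERLAP` characters after the previous one, so adjacent chunks share their edges and no sentence is lost at a boundary. A minimal sketch (not the project's actual chunker):

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into windows of chunk_size characters; consecutive
    windows overlap by `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - overlap, 1), step)]
```

Larger overlap improves recall across chunk boundaries at the cost of more (and more redundant) chunks to embed and store.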
| Feature | JSON RAG | PDF RAG |
|---|---|---|
| Data source | Pre-processed JSON | Direct PDF processing |
| Setup speed | Fast (data ready) | Slower (PDF extraction) |
| Search quality | High (optimized chunks) | Good (raw extraction) |
| Memory usage | Lower | Higher |
| Customization | High | Medium |
| Data size | ~6MB JSON files | 56MB PDF + chunks |
JSON RAG is the better fit for:

- Production deployments
- High-performance search
- Pre-processed data
- Custom data structures

PDF RAG is the better fit for:

- Quick prototyping
- Direct PDF analysis
- Simple setup
- Document exploration
Both systems can answer questions like:
- General: "What are the main electrical safety requirements?"
- Specific: "What should be the distance between receptacles in walls?"
- Technical: "Show me motor starting current calculations"
- Codes: "What does NEC 210.12 require for AFCI protection?"
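For any of these questions, both systems follow the same retrieval pattern: embed the query, score it against every stored chunk, and pass the top `MAX_RETRIEVAL_CHUNKS` results to the model as context. A toy sketch using word overlap in place of real embeddings (the actual systems use a sentence-transformer model with ChromaDB):

```python
def retrieve(query: str, chunks: list[str], k: int = 5) -> list[str]:
    """Return the k chunks sharing the most words with the query.
    Word overlap stands in for embedding similarity in this sketch."""
    q_words = set(query.lower().split())
    scored = sorted(chunks,
                    key=lambda c: len(q_words & set(c.lower().split())),
                    reverse=True)
    return scored[:k]
```

Raising `k` (i.e. `MAX_RETRIEVAL_CHUNKS`) gives the model more context per answer but increases token usage and can dilute relevance.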
- API Key Not Configured

  ```
  Error: OPENAI_API_KEY must be set when using OpenAI
  ```

  - Create a `.env` file with your API keys
  - Restart the application

- Module Import Errors

  ```
  ModuleNotFoundError: No module named 'openai'
  ```

  - Run `pip install -r requirements.txt`
  - Check you're in the correct directory

- Data Files Missing

  ```
  Error: File not found in data/
  ```

  - Verify data files are in the correct folders
  - Run the setup script to check file locations
- Memory: use JSON RAG for better memory efficiency
- Speed: pre-process data for faster startup
- Accuracy: adjust `MAX_RETRIEVAL_CHUNKS` for better context
- API keys stored in `.env` files (not in code)
- Local processing (no data sent except to chosen AI provider)
- ChromaDB local vector storage
- No hardcoded credentials
JSON RAG:

- Chunks: ~3,006 searchable chunks
- Startup: <2 minutes (first run)
- Query response: 2-5 seconds
- Memory: ~500MB RAM

PDF RAG:

- Chunks: ~1,430 text chunks
- Startup: 3-10 minutes (PDF processing)
- Query response: 2-5 seconds
- Memory: ~800MB RAM
- Choose the appropriate system folder
- Follow the existing code structure
- Update configuration as needed
- Test with both AI providers
- Update documentation
This project is for educational and research purposes. Please comply with the terms of service for your chosen AI provider (OpenAI/Google).
Get Started: Navigate to either `json-chunked_rag/` or `pdf-chunked_rag/` and follow the respective README for detailed setup instructions.