A production-ready Retrieval-Augmented Generation (RAG) system built with:
- 💡 OpenAI for embeddings and completions
- 📦 ChromaDB as the local vector store
- ⚡ FastAPI for serving the pipeline as an API
- 🧱 Modular code structure with utilities, configs, and logging
## Project Structure

```
rag-gemini-app/
├── app/                  # Core logic: ingestion, retrieval, generation
│   ├── ingest.py
│   ├── retriever.py
│   ├── generator.py
│   ├── rag_pipeline.py
│   ├── utils.py
│   └── __init__.py
├── api/                  # FastAPI server
│   ├── main.py
│   ├── routes.py
│   └── schemas.py
├── data/                 # Your input text files
│   └── documents/
├── vector_store/         # Chroma persistence
├── .env                  # API keys and environment variables
├── .gitignore
├── requirements.txt
├── run.py                # CLI interface for RAG
└── README.md
```
## Getting Started

```bash
git clone https://github.com/yourusername/rag-gemini-app.git
cd rag-gemini-app
python -m venv venv
source venv/bin/activate  # or venv\Scripts\activate on Windows
pip install -r requirements.txt
```

Create a `.env` file with your key:

```
OPENAI_API_KEY=your-openai-key-here
```

Place your `.txt` files into `data/documents/`, then ingest them:

```bash
python app/ingest.py
```

Query from the CLI:

```bash
python run.py
```

Or serve the API:

```bash
uvicorn api.main:app --reload
```

Then open: http://localhost:8000/docs
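The pipeline reads the key from the environment. A minimal sketch of how a helper in `app/utils.py` might load it (the function name is illustrative, not taken from the repo; a library such as python-dotenv can populate `os.environ` from the `.env` file first):

```python
import os

def get_openai_api_key() -> str:
    """Read the OpenAI API key from the environment.

    Assumes the .env file has already been loaded into os.environ
    (e.g. by python-dotenv); this only shows the lookup itself.
    """
    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file")
    return key
```

Failing fast here gives a clear error at startup instead of an opaque authentication failure on the first OpenAI call.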
## How It Works

**Ingestion:**
- Text files → chunked → embedded using OpenAI
- Stored in ChromaDB with metadata
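The chunking step can be sketched with a simple fixed-size, overlapping character window (an illustration only; the actual `app/ingest.py` may use a different strategy):

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows for embedding.

    Overlap keeps sentences that straddle a boundary retrievable
    from at least one chunk.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks
```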
**Retrieval:**
- Query embedded → similarity search via Chroma
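Conceptually, the similarity search Chroma performs looks like the following pure-Python sketch (cosine similarity over embedding vectors; this is an illustration, not Chroma's actual implementation):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec: list[float], doc_vecs: list[list[float]], k: int = 3) -> list[int]:
    """Return indices of the k document vectors most similar to the query."""
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine_similarity(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:k]
```

In the real pipeline, `retriever.py` delegates this ranking to Chroma's `query` call over the persisted collection.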
**Generation:**
- Top-k chunks + query → prompt sent to OpenAI ChatCompletion
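The generation step amounts to prompt assembly followed by a chat call. A sketch of the assembly (the template and model name are illustrative assumptions; the actual prompt lives in `app/generator.py`):

```python
def build_prompt(query: str, chunks: list[str]) -> str:
    """Assemble the retrieved chunks and the user query into one prompt."""
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\n"
        "Answer:"
    )

# The assembled prompt is then sent to OpenAI, along the lines of:
#   client = openai.OpenAI()
#   client.chat.completions.create(
#       model="gpt-4o-mini",  # model choice is an assumption
#       messages=[{"role": "user", "content": prompt}],
#   )
```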
## Requirements

- Python 3.9+
- An OpenAI API key
- Internet access for embedding & LLM calls
## Roadmap

- Improve chunking with NLP (spaCy, LangChain)
- Add document upload via API
- Add `/ingest` and `/health` endpoints
- Optional: swap ChromaDB for FAISS or Qdrant in production
## License

MIT
Made with ❤️ by Jenish Thapa