Intellexa is a lightweight, experimental RAG (Retrieval-Augmented Generation) system that analyzes the content of a user's HTML-based website and generates answers or summaries about that content.
This project currently supports HTML-based websites only. Non-HTML sites and JavaScript-heavy SPAs are not yet supported.
The goal is to create a simple, practical AI assistant that:
- Crawls and processes a user-provided HTML website.
- Converts the website content into vector embeddings.
- Stores and indexes the embeddings in Qdrant (a vector database).
- Uses a RAG pipeline with LangChain to retrieve relevant context and answer queries based on the content.
- Provides a clean UI for inputting a website and asking questions.
This project is designed as a proof-of-concept and is not yet production-ready.
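
To make the "crawl and process" goal concrete, here is a minimal, standard-library-only sketch of turning raw HTML into overlapping text chunks. It is not the project's code: Intellexa uses LangChain for this step, and the names `TextExtractor`, `extract_text`, and `chunk_text`, as well as the default `chunk_size` and `overlap` values, are assumptions chosen for illustration.

```python
from html.parser import HTMLParser
from typing import List


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a <script> or <style> element
        self.parts: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self._skip_depth == 0 and text:
            self.parts.append(text)


def extract_text(html: str) -> str:
    """Return the visible text of an HTML document as one string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into character windows that share `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
        start += step
    return chunks
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, which is the usual reason text splitters expose such a parameter.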
- Form for submitting a website URL or entering text directly.
- Chat-like UI for asking questions.
- Loading indicators and basic user experience enhancements.
- Handles the initial website crawl and sends the crawled content to the Python backend.
- Serves as a communication bridge between frontend and ML components.
- Extracts text from the HTML using LangChain.
- Splits the text into chunks using LangChain's `TextSplitter`.
- Converts chunks to embeddings using `sentence-transformers`.
- Stores vectors in Qdrant for similarity search.
- Answers user queries with RAG (Retrieval-Augmented Generation), grounding each response in the retrieved chunks.
- Placeholder for future Neo4j integration (currently unused).
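
The embed, store, and retrieve steps above can be illustrated with a toy, standard-library-only sketch. It is a stand-in, not the project's implementation: a bag-of-words `Counter` replaces the `sentence-transformers` embeddings, an in-memory list replaces Qdrant, and the function names (`embed`, `cosine`, `retrieve`, `build_prompt`) and the prompt wording are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (assumption, not a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, chunks: List[str], k: int = 2) -> List[Tuple[float, str]]:
    """Return the k chunks most similar to the query, best first."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(query: str, chunks: List[str], k: int = 2) -> str:
    """Assemble the retrieved context and the question into one LLM prompt."""
    context = "\n".join(chunk for _, chunk in retrieve(query, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

A possible usage, with made-up chunks: `retrieve("Which database stores vectors?", chunks, k=1)` ranks a chunk mentioning "stores vectors" above unrelated ones, and `build_prompt` would then pass that chunk to the language model as context. A real vector database performs the same nearest-neighbor ranking over dense vectors, with indexing so it does not have to scan every chunk.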
- ✅ Website crawling and scraping (HTML only).
- ✅ Chunking and embedding generation.
- ✅ Vector storage and retrieval using Qdrant.
- ✅ Question answering using LangChain RAG pipeline.
- ✅ Seamless multi-service interaction (Node + Python + Frontend).
- ✅ Clean GitHub structure with an appropriate `.gitignore` in every folder.
Intellexa/
├── client/ # React frontend (Vite)
├── backend/ # Node.js backend (Crawling & API)
├── python-backend/ # FastAPI + LangChain backend
├── docker-compose.yml
└── README.md
git clone https://github.com/kunalverma2512/Intellexa.git
cd Intellexa
cd client
npm install
npm run dev
cd backend
npm install
npm run dev
Make sure you have Docker and Docker Compose installed.
From the root directory of the project, run:
docker-compose up --build