Document Intelligence System

A document analysis and question-answering system built with Streamlit, LangChain, and Ollama. Users query their documents in natural language, and answers are produced with retrieval-augmented generation (RAG): relevant passages are retrieved from the uploaded documents and passed to the language model as context.
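
To make the retrieval step of RAG concrete, here is a minimal sketch in plain Python. It is not this project's code: the application relies on FAISS and Ollama embeddings, whereas `embed` below is a toy bag-of-words stand-in, and the names `embed`, `cosine`, `retrieve`, and `build_prompt` are illustrative assumptions.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query, best first."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(query, chunks, k=2):
    """Assemble the retrieved context and the question into one prompt string."""
    context = "\n".join(retrieve(query, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

In a real pipeline the prompt returned by `build_prompt` would be sent to the language model; swapping the toy `embed` for a proper embedding model and the sorted scan for a vector index changes the quality and speed, not the overall shape.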

Features

📄 Multi-format Document Support (PDF, CSV, TXT, and more; see Supported Document Types)
💬 Interactive Chat Interface
🔄 Batch Query Processing
📊 Document Processing with Advanced RAG
🚀 Optimized Retrieval System
📥 Exportable Results in JSON Format
🔄 Real-time Streaming Responses
🎯 Context-aware Document Analysis
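
As a rough illustration of how batch query processing and JSON export could fit together, the sketch below runs a list of questions through an answering function and serializes the results. The function names and the record layout (`question`, `answer`) are assumptions for illustration, not the application's actual export schema.

```python
import json


def run_batch(queries, answer_fn):
    """Run each query through answer_fn and collect question/answer records."""
    results = []
    for query in queries:
        query = query.strip()
        if not query:  # skip blank entries in the batch
            continue
        results.append({"question": query, "answer": answer_fn(query)})
    return results


def export_json(results):
    """Serialize results to a pretty-printed JSON string, keeping non-ASCII text."""
    return json.dumps(results, indent=2, ensure_ascii=False)


if __name__ == "__main__":
    # Hypothetical stand-in for the real question-answering chain.
    def fake_answer(question):
        return f"(answer to: {question})"

    print(export_json(run_batch(["What is the total?", "", "Who signed it?"], fake_answer)))
```

Passing the answering function in as a parameter keeps the batch loop independent of the model backend, so it can be exercised without a running Ollama server.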

Requirements

Python 3.8+
Streamlit
LangChain
Ollama (running locally or on a remote server)
FAISS for vector storage

Installation

Clone the repository:

git clone https://github.com/tankwin08/doc_intelligence_process.git
cd doc_intelligence_process

Install the required dependencies:

pip install -r requirement.txt

Start the Streamlit app:

streamlit run streamlit_app_local.py

Open your browser and navigate to the URL displayed in the terminal (typically http://localhost:8501).

Supported Document Types

PDF (.pdf)
CSV (.csv)
Text (.txt)
Word Documents (.docx, .doc)
PowerPoint (.ppt, .pptx)
Images (.jpg, .jpeg, .png)
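
One way to route an uploaded file to the right loader is to classify it by extension first. The sketch below uses only the extensions listed above; the category names and the `classify` helper are hypothetical and are not taken from the project's source.

```python
from pathlib import Path

# Extension groups taken from the list above; the category names are illustrative.
SUPPORTED = {
    "pdf": {".pdf"},
    "csv": {".csv"},
    "text": {".txt"},
    "word": {".docx", ".doc"},
    "powerpoint": {".ppt", ".pptx"},
    "image": {".jpg", ".jpeg", ".png"},
}


def classify(filename):
    """Return the document category for a filename, or None if unsupported."""
    suffix = Path(filename).suffix.lower()  # case-insensitive match on the extension
    for category, extensions in SUPPORTED.items():
        if suffix in extensions:
            return category
    return None
```

For example, `classify("Report.PDF")` returns `"pdf"`, while a file with an unlisted or missing extension returns `None` and can be rejected before any processing starts.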

Models

The application uses Ollama to run LLMs locally. By default, it uses:

deepseek-r1:latest for text generation
nomic-embed-text for embeddings

You can change the model in the sidebar of the application.
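
To check that the default generation model responds outside the app, a request can be sent straight to Ollama's HTTP API. This is a sketch under assumptions: the application itself goes through LangChain rather than this code, the server is assumed to listen on Ollama's default address `http://localhost:11434`, and the helper names are illustrative.

```python
import json
import urllib.request

# Ollama's default local address; change this for a remote server.
OLLAMA_URL = "http://localhost:11434"


def build_generate_request(prompt, model="deepseek-r1:latest", base_url=OLLAMA_URL):
    """Build (without sending) a non-streaming request to Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="deepseek-r1:latest", base_url=OLLAMA_URL, timeout=120):
    """Send the request and return the generated text (requires a running Ollama server)."""
    request = build_generate_request(prompt, model, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

Calling `generate("Say hello")` only works once the model has been downloaded to the Ollama server; building the request separately makes the payload easy to inspect without any network access.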

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License; see the LICENSE file for details.
