A powerful web application that leverages Large Language Models (LLMs) to automatically summarize web content. Simply provide a URL, and get an intelligent summary powered by local Ollama models.
- URL-based Summarization: Extract and summarize content from any web URL
- Local LLM Processing: Uses Ollama for privacy-focused, offline text processing
- Web Interface: Clean, intuitive Streamlit-based user interface
- Multiple Format Support: Handles various web content types and structures
- Jupyter Notebook: Interactive development environment for experimentation
- Frontend: Streamlit
- Web Scraping: BeautifulSoup4 + Requests
- LLM Integration: Ollama
- Language: Python 3.8+
- Development: Jupyter Notebook, IPython
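To make the stack concrete, here is a minimal sketch of the pipeline these pieces form, assuming a locally pulled `llama2` model (the function name `summarize_url` is illustrative, not the app's actual code): Requests fetches the page, BeautifulSoup4 strips it to plain text, and Ollama produces the summary.

```python
import requests
import ollama
from bs4 import BeautifulSoup

def summarize_url(url: str, model: str = 'llama2') -> str:
    # Fetch the page and reduce the HTML to readable text
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, 'html.parser')
    text = soup.get_text(separator=' ', strip=True)

    # Ask the local Ollama model for a summary
    response = ollama.chat(model=model, messages=[
        {'role': 'user', 'content': f'Summarize the following text: {text}'}
    ])
    return response['message']['content']
```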
Before running this application, ensure you have:
- Python 3.8 or higher installed
- Ollama installed and running locally
- At least one LLM model downloaded in Ollama (e.g., llama2, mistral, etc.)
Visit the official Ollama website: https://ollama.com/download
- Download: Go to https://ollama.com/download
- Click: "Download for Windows" button
- Run: Execute the downloaded `.exe` file
- Install: Follow the installation wizard
- Alternative: Use a package manager

  ```bash
  # Using winget
  winget install ollama

  # Using chocolatey
  choco install ollama
  ```
- Download: Go to https://ollama.com/download
- Click: "Download for macOS" button
- Install: Open the downloaded `.dmg` file and drag Ollama to Applications
- Alternative: Use Homebrew

  ```bash
  brew install ollama
  ```
- Quick Install: Use the official install script

  ```bash
  curl -fsSL https://ollama.com/install.sh | sh
  ```

- Manual Download: Visit https://ollama.com/download and download the Linux binary
- Package Managers:

  ```bash
  # Ubuntu/Debian (if available)
  sudo apt install ollama

  # Arch Linux
  yay -S ollama
  ```
After installation, verify Ollama is working:
```bash
# Check if Ollama is installed
ollama --version

# Start the Ollama service
ollama serve
```
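You can also check from Python: by default the Ollama server listens on localhost port 11434, and its root endpoint replies with a short status string. A quick sketch:

```python
import requests

# Ollama's HTTP API listens on localhost:11434 by default.
try:
    r = requests.get('http://localhost:11434', timeout=5)
    print(r.text)  # Expected: "Ollama is running"
except requests.ConnectionError:
    print('Ollama does not appear to be running; try `ollama serve`.')
```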
Choose and download at least one model:
```bash
# Popular models (choose one or more)

# Llama 2 (7B) - good balance of performance and speed
ollama pull llama2

# Mistral (7B) - fast and efficient
ollama pull mistral

# Code Llama (7B) - great for code-related tasks
ollama pull codellama

# Llama 2 (13B) - better performance, larger size
ollama pull llama2:13b

# Check available models
ollama list
```
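To confirm from Python that a model was pulled, the `ollama` client library exposes a `list()` call; a small sketch (the exact field names in each entry vary across client versions, so this simply prints them for inspection):

```python
import ollama

# Each entry describes one locally available model (name/tag, size, etc.)
for entry in ollama.list()['models']:
    print(entry)
```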
Test if everything is working:
```bash
# Test with a simple prompt
ollama run llama2 "Hello, how are you?"

# Or start an interactive session
ollama run llama2
```
- Clone the repository

  ```bash
  git clone https://github.com/NhanPhamThanh-IT/LLM-Web-Summarizer.git
  cd LLM-Web-Summarizer
  ```

- Create a virtual environment (recommended)

  ```bash
  python -m venv venv

  # Activate the virtual environment
  # Windows
  venv\Scripts\activate
  # macOS/Linux
  source venv/bin/activate
  ```

- Install dependencies

  ```bash
  pip install -r requirements.txt
  ```

- Ensure Ollama is running

  ```bash
  # Start the Ollama service (if not already running)
  ollama serve

  # In a new terminal, verify models are available
  ollama list
  ```

- Run the Streamlit application

  ```bash
  streamlit run text_summary_app.py
  ```

- Access the application: open your browser and go to http://localhost:8501
- Start the application using the installation steps above
- Enter a URL in the input field
- Click "Summarize" to process the content
- View the summary generated by the LLM
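The UI follows the standard Streamlit input/button/output pattern; below is a stripped-down sketch of that flow, not the app's actual source (`summarize_url` here is a placeholder standing in for the real fetch-and-summarize logic):

```python
import streamlit as st

def summarize_url(url: str) -> str:
    # Placeholder: in the real app this fetches the page and calls Ollama
    return f'(summary of {url} would appear here)'

st.title('LLM Web Summarizer')
url = st.text_input('Enter a URL to summarize')

if st.button('Summarize') and url:
    with st.spinner('Fetching and summarizing...'):
        st.write(summarize_url(url))
```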
For development and experimentation:
```bash
jupyter notebook text_summarize.ipynb
```
```
LLM-Web-Summarizer/
├── text_summary_app.py     # Main Streamlit application
├── text_summarize.ipynb    # Jupyter notebook for development
├── requirements.txt        # Python dependencies
├── README.md               # Project documentation
└── .gitignore              # Git ignore file
```
Edit the `summarize` function in `text_summary_app.py`:
```python
import ollama

def summarize(text):
    # Swap 'your-preferred-model' for any model you have pulled (see `ollama list`)
    response = ollama.chat(model='your-preferred-model', messages=[
        {
            'role': 'user',
            'content': f'Summarize the following text: {text}'
        }
    ])
    return response['message']['content']
```
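If you want more control over generation, the `ollama` Python client also accepts an `options` mapping on `chat`; the value below is illustrative, not tuned:

```python
import ollama

def summarize(text, model='llama2'):
    # A lower temperature tends to give more focused, repeatable summaries
    response = ollama.chat(
        model=model,
        messages=[{'role': 'user', 'content': f'Summarize the following text: {text}'}],
        options={'temperature': 0.2},  # illustrative value
    )
    return response['message']['content']
```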
You can use any of these models in your application:
- `llama2` - General purpose, good balance
- `mistral` - Fast and efficient
- `codellama` - Best for code-related content
- `llama2:13b` - More capable but slower
- `mixtral` - Advanced reasoning capabilities
Modify the prompt in the `summarize` function to adjust the summarization style:

```python
content = f'Provide a detailed summary with key points: {text}'
# or
content = f'Create a brief, bullet-point summary: {text}'
```
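Another approach (an assumption on my part, not something the app currently does) is to move the style instruction into a system message, keeping the user message as just the page text:

```python
import ollama

def summarize(text, style='a concise paragraph'):
    # Hypothetical variant: the summarization style lives in a system message
    response = ollama.chat(model='llama2', messages=[
        {'role': 'system', 'content': f'You summarize web pages as {style}.'},
        {'role': 'user', 'content': text},
    ])
    return response['message']['content']
```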
- Ollama Not Found

  Error: `ollama: command not found`

  Solution:
  - Ensure Ollama is properly installed from https://ollama.com/download
  - Restart your terminal after installation
  - Check that Ollama is in your PATH

- Ollama Connection Error

  Error: `Connection refused`

  Solution:
  - Ensure the Ollama service is running (`ollama serve`)
  - Check that port 11434 is available
  - Try restarting the Ollama service

- Model Not Found

  Error: `Model 'llama2' not found`

  Solution:
  - Download the model (`ollama pull llama2`)
  - Check available models (`ollama list`)
  - Verify the model name spelling in your code

- Streamlit Port Already in Use

  Error: `Port 8501 is already in use`

  Solution: Use a different port

  ```bash
  streamlit run text_summary_app.py --server.port 8502
  ```

- Web Scraping Blocked

  Error: `403 Forbidden`

  Solution: Some websites block automated requests. Try different URLs, or add browser-like headers to your requests (see the sketch after this list).

- Slow Performance

  Solution:
  - Use smaller models like `mistral` instead of `llama2:13b`
  - Ensure sufficient RAM is available
  - Close other resource-intensive applications
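For the 403 workaround above, here is a minimal sketch of adding browser-like headers with Requests (the header values are illustrative; respect each site's terms of service):

```python
import requests

# Some sites reject clients that lack a browser-like User-Agent.
HEADERS = {
    'User-Agent': (
        'Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
        'AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36'
    ),
    'Accept-Language': 'en-US,en;q=0.9',
}

response = requests.get('https://example.com/article', headers=HEADERS, timeout=10)
response.raise_for_status()  # raises for 4xx/5xx responses, including 403
html = response.text
```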
| Model | Size | Speed | Quality | Use Case |
|---|---|---|---|---|
| `mistral` | 7B | Fast | Good | General summarization |
| `llama2` | 7B | Medium | Very Good | Balanced performance |
| `codellama` | 7B | Medium | Good | Technical content |
| `llama2:13b` | 13B | Slow | Excellent | High-quality summaries |
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Ollama for providing local LLM capabilities
- Streamlit for the excellent web framework
- BeautifulSoup for web scraping functionality
Pham Thanh Nhan - ptnhanit230104@gmail.com
Project Link: https://github.com/NhanPhamThanh-IT/LLM-Web-Summarizer