LLM-Web-Summarizer


A powerful web application that leverages Large Language Models (LLMs) to automatically summarize web content. Simply provide a URL, and get an intelligent summary powered by local Ollama models.

🌟 Features

  • URL-based Summarization: Extract and summarize content from any web URL
  • Local LLM Processing: Uses Ollama for privacy-focused, offline text processing
  • Web Interface: Clean, intuitive Streamlit-based user interface
  • Multiple Format Support: Handles various web content types and structures
  • Jupyter Notebook: Interactive development environment for experimentation

🛠️ Technology Stack

  • Frontend: Streamlit
  • Web Scraping: BeautifulSoup4 + Requests
  • LLM Integration: Ollama
  • Language: Python 3.8+
  • Development: Jupyter Notebook, IPython
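
Together, these pieces form a simple fetch, extract, and summarize pipeline. The sketch below shows the fetch-and-extract half using Requests and BeautifulSoup; the function name fetch_page_text is illustrative and not taken from the project's code.

# Minimal extraction sketch (illustrative names, not the app's actual code)
import requests
from bs4 import BeautifulSoup

def fetch_page_text(url):
    """Download a page and return its visible text."""
    response = requests.get(url, timeout=15)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, 'html.parser')
    # Remove script/style tags so only readable content remains
    for tag in soup(['script', 'style']):
        tag.decompose()
    return soup.get_text(separator='\n', strip=True)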

📋 Prerequisites

Before running this application, ensure you have:

  1. Python 3.8 or higher installed
  2. Ollama installed and running locally
  3. At least one LLM model downloaded in Ollama (e.g., llama2 or mistral)

🔽 Ollama Download & Setup

Step 1: Download Ollama

Visit the official Ollama website: https://ollama.com/download

Windows

  1. Download: Go to https://ollama.com/download

  2. Click: "Download for Windows" button

  3. Run: Execute the downloaded .exe file

  4. Install: Follow the installation wizard

  5. Alternative: Use Package Manager

    # Using winget
    winget install ollama
    
    # Using chocolatey
    choco install ollama

macOS

  1. Download: Go to https://ollama.com/download
  2. Click: "Download for macOS" button
  3. Install: Open the downloaded .dmg file and drag Ollama to Applications
  4. Alternative: Use Homebrew
    brew install ollama

Linux

  1. Quick Install: Use the official install script

    curl -fsSL https://ollama.com/install.sh | sh
  2. Manual Download: Visit https://ollama.com/download and download the Linux binary

  3. Package Managers:

    # Ubuntu/Debian (if available)
    sudo apt install ollama
    
    # Arch Linux
    yay -S ollama

Step 2: Verify Installation

After installation, verify Ollama is working:

# Check if Ollama is installed
ollama --version

# Start Ollama service
ollama serve

Step 3: Download LLM Models

Choose and download at least one model:

# Popular models (choose one or more)

# Llama 2 (7B) - Good balance of performance and speed
ollama pull llama2

# Mistral (7B) - Fast and efficient
ollama pull mistral

# Code Llama (7B) - Great for code-related tasks
ollama pull codellama

# Llama 2 (13B) - Better performance, larger size
ollama pull llama2:13b

# Check available models
ollama list

Step 4: Test Ollama

Test if everything is working:

# Test with a simple prompt
ollama run llama2 "Hello, how are you?"

# Or start an interactive session
ollama run llama2
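
You can also sanity-check the setup from Python once the ollama client package (pulled in by requirements.txt below) is installed. This is a quick sketch, assuming the default local server on port 11434:

# Quick check from Python (assumes the ollama package and a running server)
import ollama

# List the models the local server knows about
print(ollama.list())

# Ask a model for a short reply to confirm generation works end to end
reply = ollama.chat(model='llama2', messages=[
    {'role': 'user', 'content': 'Reply with one short sentence.'}
])
print(reply['message']['content'])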

🚀 Installation & Setup

  1. Clone the repository

    git clone https://github.com/NhanPhamThanh-IT/LLM-Web-Summarizer.git
    cd LLM-Web-Summarizer
  2. Create a virtual environment (recommended)

    python -m venv venv
    
    # Activate virtual environment
    # Windows
    venv\Scripts\activate
    
    # macOS/Linux
    source venv/bin/activate
  3. Install dependencies

    pip install -r requirements.txt
  4. Ensure Ollama is running

    # Start Ollama service (if not already running)
    ollama serve
    
    # In a new terminal, verify models are available
    ollama list
  5. Run the Streamlit application

    streamlit run text_summary_app.py
  6. Access the application

    • Open your browser and go to http://localhost:8501

📖 Usage

Web Application

  1. Start the application using the installation steps above
  2. Enter a URL in the input field
  3. Click "Summarize" to process the content
  4. View the summary generated by the LLM
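
At its core the app wires a text input and a button to a scrape-then-chat call. The stripped-down sketch below illustrates that flow; it is not the project's exact code (see text_summary_app.py for the real implementation):

# Stripped-down sketch of the Streamlit flow (illustrative only)
import streamlit as st
import requests
from bs4 import BeautifulSoup
import ollama

st.title('LLM Web Summarizer')
url = st.text_input('Enter a URL to summarize')

if st.button('Summarize') and url:
    # Fetch the page and keep only its visible text
    page = requests.get(url, timeout=15)
    text = BeautifulSoup(page.text, 'html.parser').get_text(' ', strip=True)
    # Ask the local model for a summary and display it
    response = ollama.chat(model='llama2', messages=[
        {'role': 'user', 'content': f'Summarize the following text: {text}'}
    ])
    st.write(response['message']['content'])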

Jupyter Notebook

For development and experimentation:

jupyter notebook text_summarize.ipynb

🏗️ Project Structure

LLM-Web-Summarizer/
├── text_summary_app.py      # Main Streamlit application
├── text_summarize.ipynb     # Jupyter notebook for development
├── requirements.txt         # Python dependencies
├── README.md               # Project documentation
└── .gitignore             # Git ignore file

🔧 Configuration

Changing the LLM Model

Edit the summarize function in text_summary_app.py:

import ollama  # local Ollama client already used by the app

def summarize(text):
    # Replace 'your-preferred-model' with any model shown by `ollama list`
    response = ollama.chat(model='your-preferred-model', messages=[
        {
            'role': 'user',
            'content': f'Summarize the following text: {text}'
        }
    ])
    return response['message']['content']

Available Models

You can use any of these models in your application:

  • llama2 - General purpose, good balance
  • mistral - Fast and efficient
  • codellama - Best for code-related content
  • llama2:13b - More capable but slower
  • mixtral - Advanced reasoning capabilities

Customizing Summarization Prompts

Modify the prompt in the summarize function to adjust the summarization style:

content = f'Provide a detailed summary with key points: {text}'
# or
content = f'Create a brief, bullet-point summary: {text}'
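
If you want the style to be selectable at runtime instead of hard-coded, one possible extension (not part of the current app) is to expose it as a Streamlit selectbox and build the prompt from the choice:

# Hypothetical extension: let the user pick a summary style in the sidebar
import streamlit as st

STYLES = {
    'Brief bullets': 'Create a brief, bullet-point summary:',
    'Detailed': 'Provide a detailed summary with key points:',
}

style = st.sidebar.selectbox('Summary style', list(STYLES.keys()))

def build_prompt(text):
    # Prepend the chosen instruction to the scraped page text
    return f'{STYLES[style]} {text}'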

🐛 Troubleshooting

Common Issues

  1. Ollama Not Found

    Error: ollama: command not found
    

    Solution:

    • Ensure Ollama is properly installed from https://ollama.com/download
    • Restart your terminal after installation
    • Check if Ollama is in your PATH
  2. Ollama Connection Error

    Error: Connection refused
    

    Solution:

    • Ensure Ollama service is running (ollama serve)
    • Check if port 11434 is available
    • Try restarting Ollama service
  3. Model Not Found

    Error: Model 'llama2' not found
    

    Solution:

    • Download the model (ollama pull llama2)
    • Check available models (ollama list)
    • Verify model name spelling in your code
  4. Streamlit Port Already in Use

    Error: Port 8501 is already in use
    

    Solution: Use a different port

    streamlit run text_summary_app.py --server.port 8502
  5. Web Scraping Blocked

    Error: 403 Forbidden
    

    Solution: Some websites block automated requests. Try a different URL, or send browser-like request headers (see the sketch after this list).

  6. Slow Performance

    Solution:

    • Use smaller models like mistral instead of llama2:13b
    • Ensure sufficient RAM is available
    • Close other resource-intensive applications
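
For the 403 case in item 5, a common workaround is to send a browser-like User-Agent header with the request. A minimal sketch (the header string is only an example):

# Example workaround for sites that reject the default Requests user agent
import requests

headers = {
    # Any mainstream browser UA string can be used; this one is an example
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'
}
response = requests.get('https://example.com/article', headers=headers, timeout=15)
response.raise_for_status()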

📊 Model Recommendations

Model        Size   Speed    Quality     Use Case
mistral      7B     Fast     Good        General summarization
llama2       7B     Medium   Very Good   Balanced performance
codellama    7B     Medium   Good        Technical content
llama2:13b   13B    Slow     Excellent   High-quality summaries

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

📧 Contact

Pham Thanh Nhan - ptnhanit230104@gmail.com

Project Link: https://github.com/NhanPhamThanh-IT/LLM-Web-Summarizer


⭐ If this project helped you, please give it a star!
