
Austin's Multilanguage Invoice Extractor is a Streamlit-based web application designed to analyze invoice images using Google Gemini AI's generative capabilities. The application allows users to upload invoice images and provides detailed answers to any input queries regarding the uploaded invoices.


Algorithmia-SE/Invoice-Extractor-LLM-App

 
 


Multilanguage Invoice Extractor Documentation


Project Overview

Austin's Multilanguage Invoice Extractor is a Streamlit-based web application designed to analyze invoice images using Google Gemini AI's generative capabilities. The application allows users to upload invoice images and provides detailed answers to any input queries regarding the uploaded invoices.


Table of Contents

  1. Getting Started
  2. File Structure
  3. Installation Guide
  4. Running the Application
  5. Environment Variables
  6. Usage Guide
  7. Error Handling
  8. Dependencies
  9. Future Features

Getting Started

This application extracts and analyzes the content of invoice images in multiple languages using Google’s Generative AI (Gemini model). The dependency list also includes Python libraries for PDF handling and vector operations, which are reserved for future expansion.

The application takes as input an image of an invoice and a user prompt (question), and it generates a detailed response based on the content of the invoice image. This is done using Google's Generative AI.
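The request described above can be sketched as follows. This is a minimal illustration, not the code in app.py: the helper names `build_request_parts` and `ask_gemini`, the instruction text, and the model name `gemini-1.5-flash` are assumptions. The `google-generativeai` import is deferred so that the pure helper can be exercised without the package or an API key.

```python
def build_request_parts(instruction, image_bytes, mime_type, question):
    """Assemble the multimodal request: instruction, image, then the question."""
    if not image_bytes:
        raise ValueError("image_bytes must not be empty")
    # Inline image data is passed as a dict with a MIME type and raw bytes.
    image_part = {"mime_type": mime_type, "data": image_bytes}
    return [instruction, image_part, question]


def ask_gemini(api_key, image_bytes, mime_type, question,
               model_name="gemini-1.5-flash"):  # model name is an assumption
    """Send the invoice image and question to Gemini and return the text answer."""
    import google.generativeai as genai  # deferred: needs google-generativeai

    genai.configure(api_key=api_key)
    model = genai.GenerativeModel(model_name)
    # Hypothetical instruction text; the real app may word this differently.
    instruction = "You are an expert at reading invoices. Answer from the image only."
    parts = build_request_parts(instruction, image_bytes, mime_type, question)
    response = model.generate_content(parts)
    return response.text
```

Keeping the request assembly separate from the network call makes it possible to check the payload shape without contacting the API.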


File Structure

  • app.py: The main Streamlit app that handles user interaction and invoice image processing.
  • requirements.txt: Lists the required Python libraries and dependencies.
  • .env: Contains the environment variables, including the Google API key.
  • venv: Python virtual environment for isolated package management.
  • README.md: Documentation for the application.

Installation Guide

1. Clone the Repository

To start, clone the repository to your local machine:

git clone <repository-url>
cd Invoice-Extractor-LLM-App

2. Set up the Virtual Environment

Use a virtual environment to keep the project's dependencies isolated. If you have not created one yet, run:

python3 -m venv venv
source venv/bin/activate   # For Mac/Linux
# or
venv\Scripts\activate.bat  # For Windows
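To confirm that the virtual environment is active before installing anything, a short check such as the one below can help. It is a sketch that relies only on the standard library; the function name `in_virtualenv` is an assumption for illustration.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if not in_virtualenv():
    print("Warning: no virtual environment is active.")
```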

3. Install Dependencies

Install the necessary dependencies using pip from the requirements.txt file:

pip install -r requirements.txt

This installs the required libraries, including streamlit, google-generativeai, and langchain.

4. Configure Environment Variables

You will need to create a .env file to store your API key for Google Generative AI. Make sure your .env file contains the following:

GOOGLE_API_KEY=<your-google-api-key>

Replace <your-google-api-key> with your actual Google Generative AI API key.


Running the Application

Once all dependencies are installed and the environment is configured, you can run the application using Streamlit. Use the following command to launch the app:

streamlit run app.py

This will launch the Streamlit app in your default web browser.


Environment Variables

The .env file keeps sensitive values such as API keys out of the source code. Its structure is:

GOOGLE_API_KEY=<your-google-api-key>
  • GOOGLE_API_KEY: This is required to authenticate the use of Google’s Generative AI model (Gemini) to process and analyze the invoice image.
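One way to read this variable at startup is sketched below. The helper names `get_api_key` and `load_env_file` are assumptions, not taken from app.py; `load_dotenv` is the python-dotenv function that copies values from .env into the process environment.

```python
import os


def load_env_file():
    """Copy values from .env into os.environ using python-dotenv, if installed."""
    try:
        from dotenv import load_dotenv
    except ImportError:
        return False
    return load_dotenv()


def get_api_key(env=None):
    """Return GOOGLE_API_KEY from the given mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    key = env.get("GOOGLE_API_KEY", "").strip()
    # Reject a missing key and the unedited "<your-google-api-key>" placeholder.
    if not key or (key.startswith("<") and key.endswith(">")):
        raise RuntimeError("GOOGLE_API_KEY is not set; add it to your .env file")
    return key
```

Failing early with a clear message is usually easier to diagnose than an authentication error returned later by the API.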

Usage Guide

  1. Input Prompt: In the provided text field, enter a prompt/question you want the model to answer about the invoice.

    Example:

    • "What is the total amount on this invoice?"
    • "Who is the recipient of this invoice?"
  2. Upload Invoice Image: Upload an invoice image in .jpg, .jpeg, or .png format.

  3. Submit: Click the "Tell me about the Invoice" button to process the image and get a response.

  4. Results: The response generated by the Gemini model will be displayed on the page under "The Response is".
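The four steps above could be wired together roughly as follows. This is a sketch under stated assumptions rather than the actual app.py: the function names, the page header, and the input and uploader labels are hypothetical, while the button text and the "The Response is" heading come from the steps above. `answer_fn` stands in for whatever function calls the Gemini model.

```python
def handle_submit(question, uploaded_file, answer_fn):
    """Turn one button click into a response string."""
    if uploaded_file is None:
        raise FileNotFoundError("No file uploaded")
    # Streamlit's uploaded file object exposes the MIME type and raw bytes.
    image_part = {"mime_type": uploaded_file.type, "data": uploaded_file.getvalue()}
    return answer_fn(question, image_part)


def main(answer_fn):
    """Render the page; call this at the bottom of the script Streamlit runs."""
    import streamlit as st  # deferred so handle_submit works without Streamlit

    st.header("Multilanguage Invoice Extractor")  # header text is an assumption
    question = st.text_input("Input Prompt:")  # widget label is an assumption
    uploaded_file = st.file_uploader(
        "Choose an invoice image", type=["jpg", "jpeg", "png"]
    )
    if st.button("Tell me about the Invoice"):
        answer = handle_submit(question, uploaded_file, answer_fn)
        st.subheader("The Response is")
        st.write(answer)
```

Passing `answer_fn` in as a parameter keeps the page logic independent of the model call, so the click handling can be checked with a stub.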


Error Handling

  • File Not Uploaded: If no file is uploaded and the user clicks the submit button, a FileNotFoundError will be raised with the message "No file uploaded".

  • Invalid API Key: Ensure that the API key is correctly configured in the .env file. Incorrect API keys will result in authentication errors when interacting with Google Generative AI.

  • Image File Format: Ensure that only supported formats (jpg, jpeg, png) are uploaded. Unsupported formats will lead to file processing errors.
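The cases listed above can be checked before any request is sent. The sketch below is illustrative only: `SUPPORTED_TYPES`, `mime_type_for`, and `describe_error` are hypothetical names, and the user-facing messages are assumptions rather than text from the application.

```python
import os

# Extensions accepted by the app, mapped to their MIME types.
SUPPORTED_TYPES = {".jpg": "image/jpeg", ".jpeg": "image/jpeg", ".png": "image/png"}


def mime_type_for(filename):
    """Return the MIME type for a supported image name, else raise ValueError."""
    ext = os.path.splitext(filename)[1].lower()
    try:
        return SUPPORTED_TYPES[ext]
    except KeyError:
        raise ValueError(f"Unsupported image format: {ext or '(none)'}") from None


def describe_error(exc):
    """Map the failure modes listed above to a short user-facing message."""
    if isinstance(exc, FileNotFoundError):
        return "Please upload an invoice image before submitting."
    if isinstance(exc, ValueError):
        return f"{exc}. Use a .jpg, .jpeg, or .png file."
    # Fallback: other failures most often come from the API call itself.
    return "The request to Gemini failed; check GOOGLE_API_KEY in your .env file."
```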


Dependencies

Below is a list of all the libraries and dependencies used in this project, as listed in the requirements.txt file:

  • streamlit: For building the interactive web interface.
  • google-generativeai: To access Google’s Gemini AI model for content generation.
  • python-dotenv: To manage environment variables.
  • langchain: For natural language processing and chaining model queries.
  • PyPDF2: To handle PDF file uploads (for future expansions).
  • chromadb: For vector-based operations (optional, for advanced AI workflows).
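To verify that these packages are present in the active environment, a small standard-library check can be used. This is a sketch: the `REQUIRED` list mirrors the bullets above, and the function name `missing_packages` is an assumption.

```python
from importlib import metadata

# Distribution names as they would appear in requirements.txt.
REQUIRED = ["streamlit", "google-generativeai", "python-dotenv",
            "langchain", "PyPDF2", "chromadb"]


def missing_packages(names=REQUIRED):
    """Return the distributions from `names` that are not installed."""
    missing = []
    for name in names:
        try:
            metadata.version(name)  # raises if the distribution is absent
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    absent = missing_packages()
    print("Missing:", ", ".join(absent) if absent else "none")
```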

Future Features

  • PDF Support: Add support for uploading and extracting data from PDF invoices using PyPDF2.
  • Language Translation: Integrate language detection and translation to support invoices in multiple languages.
  • Model Expansion: Expand the model to support more complex queries and integrate invoice-specific AI models.
