Multilanguage Invoice Extractor Documentation

Project Overview

Austin's Multilanguage Invoice Extractor is a Streamlit web application that analyzes invoice images with Google's Gemini generative model. Users upload an invoice image, ask a question about it, and receive a detailed answer generated from the content of that image.

Getting Started

This application extracts and analyzes information from invoice images in multiple languages. It combines Google’s Generative AI (Gemini model) with supporting Python libraries. The PDF-handling library listed in requirements.txt is reserved for planned PDF support and is not needed for image uploads.

The application takes as input an image of an invoice and a user prompt (question), and it generates a detailed response based on the content of the invoice image. This is done using Google's Generative AI.
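The flow described above can be sketched as follows. This is a minimal illustration and not the code in app.py: the helper names build_request and ask_about_invoice are hypothetical, and the model object is assumed to expose a generate_content method that accepts a list of a text prompt plus an inline image part, as the google-generativeai SDK's GenerativeModel does.

```python
def build_request(prompt, image_bytes, mime_type):
    """Pair the user's question with the invoice image.

    The image is passed as an inline blob: a dict holding the MIME type
    and the raw bytes of the uploaded file.
    """
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    image_part = {"mime_type": mime_type, "data": image_bytes}
    return [prompt, image_part]


def ask_about_invoice(model, prompt, image_bytes, mime_type="image/jpeg"):
    """Send the prompt and image to the model and return the answer text."""
    response = model.generate_content(build_request(prompt, image_bytes, mime_type))
    return response.text


# Possible wiring with the real SDK (the model name is an assumption,
# pick whichever Gemini vision-capable model your key can access):
#   import google.generativeai as genai
#   genai.configure(api_key="...")
#   model = genai.GenerativeModel("gemini-1.5-flash")
#   print(ask_about_invoice(model, "What is the total amount?", image_bytes, "image/png"))
```

Passing the model in as an argument keeps the helper easy to test with a stand-in object, without making a network call.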

File Structure

app.py: The main Streamlit app that handles user interaction and invoice image processing.
requirements.txt: Lists the required Python libraries and dependencies.
.env: Contains the environment variables, including the Google API key.
venv: Python virtual environment for isolated package management.
README.md: Documentation for the application.

Installation Guide

1. Clone the Repository

To start, clone the repository to your local machine:

git clone <repository-url>
cd multilanguage-invoice-extractor

2. Set up the Virtual Environment

Use a virtual environment to isolate the project's dependencies. If you have not created one yet, create and activate it:

python3 -m venv venv
source venv/bin/activate   # For Mac/Linux
# or
venv\Scripts\activate.bat  # For Windows

3. Install Dependencies

Install the necessary dependencies using pip from the requirements.txt file:

pip install -r requirements.txt

This installs the required libraries, including streamlit, google-generativeai, and langchain.
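To confirm that the install succeeded before launching the app, a short check like the one below may help. It is a sketch that uses only the standard library. The function name missing_modules is hypothetical, and the list holds the import names that correspond to the packages in the Dependencies section (for example, python-dotenv is imported as dotenv).

```python
import importlib.util

# Import names for the packages listed in requirements.txt.
REQUIRED_MODULES = [
    "streamlit",
    "google.generativeai",
    "dotenv",
    "langchain",
    "PyPDF2",
    "chromadb",
]


def missing_modules(names):
    """Return the module names that cannot be located in this environment."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (e.g. "google") is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    gaps = missing_modules(REQUIRED_MODULES)
    print("All dependencies importable." if not gaps else f"Missing: {', '.join(gaps)}")
```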

4. Configure Environment Variables

You will need to create a .env file to store your API key for Google Generative AI. Make sure your .env file contains the following:

GOOGLE_API_KEY=<your-google-api-key>

Replace <your-google-api-key> with your actual Google Generative AI API key.

Running the Application

Once all dependencies are installed and the environment is configured, you can run the application using Streamlit. Use the following command to launch the app:

streamlit run app.py

This will launch the Streamlit app in your default web browser.

Environment Variables

The .env file is used to securely store sensitive information like API keys. Here's the structure of the .env file:

GOOGLE_API_KEY=<your-google-api-key>

GOOGLE_API_KEY: This is required to authenticate the use of Google’s Generative AI model (Gemini) to process and analyze the invoice image.
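One way to read the key at startup and fail early with a clear message is sketched below. The helper name get_api_key is hypothetical and app.py may load the key differently. The sketch assumes python-dotenv's load_dotenv is used to copy values from .env into the process environment, and it skips that step if the package is not installed.

```python
import os

try:
    # python-dotenv reads .env and populates os.environ.
    from dotenv import load_dotenv
    load_dotenv()
except ImportError:
    # Fall back to whatever is already set in the environment.
    pass


def get_api_key(env=None):
    """Return GOOGLE_API_KEY, rejecting empty values and the <...> placeholder."""
    env = os.environ if env is None else env
    key = env.get("GOOGLE_API_KEY", "").strip()
    if not key or key.startswith("<"):
        raise RuntimeError(
            "GOOGLE_API_KEY is missing or still a placeholder; set it in .env"
        )
    return key
```

The optional env argument exists only so the function can be exercised with a plain dictionary instead of the real environment.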

Usage Guide

Input Prompt: In the provided text field, enter the question you want the model to answer about the invoice. Examples:
- "What is the total amount on this invoice?"
- "Who is the recipient of this invoice?"

Upload Invoice Image: Upload an invoice image in .jpg, .jpeg, or .png format.
Submit: Click the "Tell me about the Invoice" button to process the image and get a response.
Results: The response generated by the Gemini model will be displayed on the page under "The Response is".

Error Handling

File Not Uploaded: If no file is uploaded and the user clicks the submit button, a FileNotFoundError will be raised with the message "No file uploaded".
Invalid API Key: Ensure that the API key is correctly configured in the .env file. Incorrect API keys will result in authentication errors when interacting with Google Generative AI.
Image File Format: Ensure that only supported formats (jpg, jpeg, png) are uploaded. Unsupported formats will lead to file processing errors.
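The upload checks above could be gathered into a single validation step, as in the sketch below. The function name check_upload and the empty-file check are illustrative additions, not taken from app.py. The FileNotFoundError message matches the one described above.

```python
from pathlib import Path

# Supported extensions and the MIME type to send along with the image bytes.
MIME_BY_SUFFIX = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
}


def check_upload(filename, data):
    """Validate an uploaded invoice image and return its MIME type."""
    if filename is None or data is None:
        raise FileNotFoundError("No file uploaded")
    suffix = Path(filename).suffix.lower()
    if suffix not in MIME_BY_SUFFIX:
        raise ValueError(f"Unsupported format {suffix!r}; expected jpg, jpeg, or png")
    if not data:
        raise ValueError("Uploaded file is empty")
    return MIME_BY_SUFFIX[suffix]
```

Running this check before the model call means a bad upload produces a specific message instead of a generic processing error.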

Dependencies

Below is a list of all the libraries and dependencies used in this project, as listed in the requirements.txt file:

streamlit: For building the interactive web interface.
google-generativeai: To access Google’s Gemini AI model for content generation.
python-dotenv: To manage environment variables.
langchain: For natural language processing and chaining model queries.
PyPDF2: To handle PDF file uploads (for future expansions).
chromadb: For vector-based operations (optional, for advanced AI workflows).

Future Features

PDF Support: Add support for uploading and extracting data from PDF invoices using PyPDF2.
Language Translation: Integrate language detection and translation to support invoices in multiple languages.
Model Expansion: Expand the model to support more complex queries and integrate invoice-specific AI models.
