document-processing-pipeline

Here are 8 public repositories matching this topic...

Slayer412 / docling-bedrock-plugin

Integrates AWS Bedrock's multimodal capabilities (Claude 3) into the Docling framework for generating image descriptions within document processing pipelines.

python image-descriptions document-processing-pipeline aws-bedrock docling

Updated Apr 28, 2025
Python

aws-samples / esg-compliance-report-processor

A serverless solution to streamline ESG compliance using AI-driven automation. Built with the AWS CDK (Python), Amazon Textract, Amazon Bedrock, and other AWS services to process and analyse compliance reports.

compliance esg document-processing-pipeline generative-ai

Updated May 28, 2025
Python

ShaliniAnandaPhD / PRISM

PRISM is a repository that combines Retrieval-Augmented Generation (RAG) with a multi-LLM voting approach to create accurate and reliable AI-generated outputs. It integrates multiple language models, including Mistral, Claude 3.5, and OpenAI models, to enhance performance through advanced consensus techniques.

multi-model lora fine-tuning weighted-majority legal-ai document-processing-pipeline llm phi-4 semantic-ai ai-transparency

Updated Jun 20, 2025
Python

buddywhitman / dist-gcs-pdf-processing

Distributed GCS-to-GCS multilingual PDF processing service built for horizontal scaling and concurrency. It can be deployed with Docker Compose for high-volume processing.

ocr-service google-cloud-platform processing-library gemini-api pdf-document-processor supabase document-processing-pipeline pdf-cleaning

Updated Jul 8, 2025
Python

ArevikKH / PDF-Summarizer-Multilang-OCR

AI-powered system for summarizing PDF content with Armenian, Russian, and English language support. Automatically extracts and summarizes text, applies OCR to images, and identifies visual elements in documents. Built for efficient multilingual PDF processing.