
Conversation


@rushabh31 rushabh31 commented Jul 20, 2025

What does this PR do?

This PR adds support for Groq's vision API and Google Cloud Vertex AI to the vision-parse library, enabling users to choose among multiple powerful vision models when processing images and PDFs.

  • Adds support for Groq vision models
  • Adds support for Vertex AI vision models

Features Added

  1. Groq Integration: Implemented full support for Groq's vision models (meta-llama/llama-4-scout-17b-16e-instruct and meta-llama/llama-4-maverick-17b-128e-instruct)

  2. Vertex AI Integration: Added support for Google Cloud's Vertex AI platform with Gemini models (gemini-1.5-pro-002 and gemini-1.5-flash-002)

  3. Flexible Configuration Options:

    • Added groq_config parameter to both LLM and VisionParser classes
    • Added vertex_config parameter with support for multiple authentication methods (API key, service account JSON, service account key file)
  4. Robust Error Handling:

    • Implemented specific error handling for Groq's pixel size limitations
    • Added error handling for Vertex AI image size and dimension constraints
    • Provided clear guidance to users when images exceed API limits
  5. Documentation & Examples:

    • Created example scripts demonstrating both Groq and Vertex AI API usage
    • Added test examples for verifying integrations
    • Updated dependency management (pyproject.toml) with appropriate requirements
  6. Performance Optimization: Added guidance on proper image resolution settings to stay within API limits (see the resizing sketch after this list)

  7. Page-Level Visual Analysis: Implemented a new workflow to send entire page images to LLMs for detecting and summarizing embedded visuals like images, diagrams, charts, and visualizations

  8. Configurable Visual Summary: Added enable_image_summary parameter to toggle visual element detection and summary generation
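
To make the resolution guidance in point 6 concrete, here is a minimal sketch (not code from this PR) that downscales an image with Pillow before it is sent to a provider. The MAX_PIXELS value is a placeholder, not a documented Groq or Vertex AI limit.

```python
# Illustrative only: shrink an image so width * height stays under a
# provider's pixel budget. MAX_PIXELS is a placeholder value, not the
# documented Groq or Vertex AI limit.
from PIL import Image

MAX_PIXELS = 16_000_000  # placeholder cap on total pixels


def downscale_if_needed(in_path: str, out_path: str) -> str:
    img = Image.open(in_path)
    w, h = img.size
    if w * h <= MAX_PIXELS:
        return in_path  # already within the budget
    scale = (MAX_PIXELS / (w * h)) ** 0.5
    img = img.resize((max(1, int(w * scale)), max(1, int(h * scale))), Image.LANCZOS)
    img.save(out_path)
    return out_path
```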

Implementation Details

Groq Integration:

  • Added Groq models to supported models in constants.py
  • Extended LLM class to include Groq client initialization and request handling
  • Updated VisionParser to accept and pass through Groq configuration
  • Added proper error detection and messaging for Groq-specific limitations
  • Added proper optional dependency for Groq in pyproject.toml
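
A hedged sketch of what Groq usage might look like with these changes, assuming the existing convert_pdf entry point stays the same; the groq_config field names are assumptions for illustration, not the finalized API.

```python
# Sketch only: how the new groq_config parameter might be passed to
# VisionParser. The "api_key" field name is an assumption.
from vision_parse import VisionParser

parser = VisionParser(
    model_name="meta-llama/llama-4-scout-17b-16e-instruct",  # Groq vision model added by this PR
    groq_config={"api_key": "your-groq-api-key"},            # assumed config shape
    temperature=0.4,
)

markdown_pages = parser.convert_pdf("document.pdf")
for i, page in enumerate(markdown_pages):
    print(f"--- Page {i + 1} ---\n{page}")
```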

Vertex AI Integration:

  • Added Vertex AI models to supported models in constants.py
  • Implemented Vertex AI client initialization with multiple authentication methods
  • Added _vertex method to handle image processing through Vertex AI
  • Updated VisionParser to accept and pass vertex_config parameter
  • Added comprehensive error handling for Vertex AI limitations
  • Created usage examples for Vertex AI integration
  • Added proper optional dependencies for Vertex AI in pyproject.toml
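
A sketch of the three authentication paths described above; the vertex_config keys shown (project_id, location, api_key, service_account_json, service_account_file) are assumptions for illustration rather than the exact names introduced by this PR.

```python
# Sketch only: possible shapes of vertex_config for the three auth methods
# described above. All field names are assumptions.
from vision_parse import VisionParser

vertex_config = {
    "project_id": "my-gcp-project",
    "location": "us-central1",
    # Choose one of the authentication options:
    # "api_key": "your-api-key",
    # "service_account_json": {"type": "service_account", "...": "..."},
    "service_account_file": "path/to/sa-key.json",
}

parser = VisionParser(
    model_name="gemini-1.5-pro-002",  # Vertex AI model added by this PR
    vertex_config=vertex_config,
)
markdown_pages = parser.convert_pdf("document.pdf")
```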

Page-Level Visual Analysis:

  • Implemented new workflow to send entire page images to LLMs for visual element detection
  • Added page_visuals_prompt template for instructing LLMs to identify and summarize embedded images, charts, diagrams, etc.
  • Created detect_page_visuals method in LLM class to handle the visual detection and summarization process
  • Updated parser workflow to integrate visual summaries into the markdown output
  • Removed legacy individual image extraction code in favor of the more efficient page-level approach
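
To show the shape of this workflow (not the PR's actual implementation), a minimal sketch: each rendered page image is sent with a visuals prompt, and any returned summary is appended to that page's markdown. The llm.generate call and the prompt text below are stand-ins for the real detect_page_visuals method and page_visuals_prompt template.

```python
# Illustrative sketch of the page-level workflow; `llm.generate` and the
# prompt text stand in for the PR's detect_page_visuals method and
# page_visuals_prompt template.
def summarize_page_visuals(llm, page_image_b64: str, page_markdown: str) -> str:
    prompt = (
        "List any images, charts, diagrams, or other visualizations on this "
        "page and briefly summarize what each one shows."
    )
    summary = llm.generate(prompt=prompt, image=page_image_b64)  # hypothetical call
    if summary and summary.strip():
        return f"{page_markdown}\n\n**Visual elements on this page:**\n{summary}"
    return page_markdown
```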

Configurable Visual Summary:

  • Added enable_image_summary parameter to LLM class with default value of True
  • Extended VisionParser to accept and pass this parameter to LLM instance
  • Updated the conversion logic to conditionally perform visual analysis based on the parameter value
  • Added example usage in documentation to demonstrate how to toggle the feature
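
A short usage sketch of the toggle; the enable_image_summary name comes from this PR, while the other arguments are placeholders reused from the earlier examples.

```python
# Sketch: disabling the page-level visual summary step. The parameter name
# comes from this PR; other arguments are placeholders.
from vision_parse import VisionParser

parser = VisionParser(
    model_name="gemini-1.5-flash-002",
    vertex_config={"service_account_file": "path/to/sa-key.json"},  # assumed field name
    enable_image_summary=False,  # skip visual detection and summaries entirely
)
markdown_pages = parser.convert_pdf("document.pdf")
```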

Before submitting

  • This PR improves the library by adding support for new LLM providers (Groq and Vertex AI)
  • Ran make lint and make format to handle lint / formatting issues
  • Ran make test to run the relevant test scripts
  • Read the contributor guidelines
  • Wrote example code demonstrating the new functionalities
  • Added tests for both Groq and Vertex AI integrations

Testing

Groq Testing

  • Manually tested with the Groq API to verify the implementation works correctly
  • Created unit tests for both the LLM and parser classes to ensure proper integration with Groq
  • Verified error handling for image size limitations

Vertex AI Testing

  • Added tests for Vertex AI client initialization with different authentication methods
  • Created unit tests for both the LLM and parser classes to ensure proper integration with Vertex AI
  • Verified proper error handling for Vertex AI-specific limitations
  • Created example script demonstrating usage of Vertex AI with proper configuration

@rushabh31 requested a review from iamarunbrahma as a code owner on July 20, 2025 at 17:54
@dosubot (bot) added the size:L (This PR changes 100-499 lines, ignoring generated files) and enhancement (New feature or request) labels on Jul 20, 2025
@rushabh31 (Author) commented:

@iamarunbrahma Added support for Groq vision models.

@rushabh31 changed the title from "adding code for groq support" to "[Feature] adding code for groq support" on Jul 20, 2025
@dosubot (bot) added the size:XL (This PR changes 500-999 lines, ignoring generated files) label and removed the size:L label on Jul 20, 2025
@dosubot (bot) added the lgtm (This PR has been approved by a maintainer) label on Jul 23, 2025