@dmentx commented Sep 17, 2025

What does this PR do?

This pull request adds support for using vLLM, an OpenAI-compatible inference server, as a backend for vision parsing. The changes include dependency and configuration updates, logic for selecting and initializing the vLLM client, error handling improvements, and new tests to ensure vLLM integration works both in unit and integration scenarios.

Key changes include:

vLLM Support and Client Initialization:

  • Added vllm as an optional dependency and included it in the pyproject.toml for installation and testing.
  • Updated the model-to-provider mapping in constants.py to associate the new unsloth/Mistral-Small-3.1-24B-Instruct-2503-bnb-4bit model with the vllm provider.
  • Refactored llm.py to support initialization and usage of vLLM as a provider, including a dynamic import with a fallback to the OpenAI client classes if vLLM does not expose its own.
  • Adjusted logic for client instantiation, API key, and base URL selection to support vLLM-specific environment variables and defaults.
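The client-instantiation logic above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the environment variable names (`VLLM_BASE_URL`, `VLLM_API_KEY`) and defaults are assumptions.

```python
import os

def resolve_client_config(provider: str) -> dict:
    """Pick the API key and base URL for the selected provider.

    Hypothetical sketch: vLLM serves an OpenAI-compatible API, so a
    placeholder key is sufficient because a default vLLM server does
    not validate it.
    """
    if provider == "vllm":
        return {
            "base_url": os.getenv("VLLM_BASE_URL", "http://localhost:8000/v1"),
            "api_key": os.getenv("VLLM_API_KEY", "EMPTY"),
        }
    # Default: standard OpenAI configuration.
    return {
        "base_url": os.getenv("OPENAI_BASE_URL", "https://api.openai.com/v1"),
        "api_key": os.getenv("OPENAI_API_KEY", ""),
    }
```

The resulting dictionary would then be passed to an OpenAI-compatible client constructor (e.g. `OpenAI(**resolve_client_config("vllm"))`).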

Request Routing and Error Handling:

  • Updated routing logic to treat openai and vllm providers equivalently for vision model requests.
  • Improved error handling and messaging to distinguish between OpenAI and vLLM failures.
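The routing and error-handling behavior described above could look like the following sketch. All function names here are illustrative assumptions, not the PR's actual identifiers.

```python
# Providers that share the OpenAI-compatible code path.
OPENAI_COMPATIBLE_PROVIDERS = {"openai", "vllm"}

def call_openai_compatible(payload: dict) -> str:
    # Stand-in for the real chat-completions request to the backend.
    return "## parsed markdown"

def route_vision_request(provider: str, payload: dict) -> str:
    """Route a vision request; errors name the backend that failed."""
    if provider in OPENAI_COMPATIBLE_PROVIDERS:
        try:
            return call_openai_compatible(payload)
        except ConnectionError as exc:
            backend = "vLLM" if provider == "vllm" else "OpenAI"
            raise RuntimeError(f"{backend} vision request failed: {exc}") from exc
    raise ValueError(f"Unsupported provider for vision models: {provider}")
```

Treating both providers as one code path keeps the request logic identical while the error message still tells the user which backend to debug.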

Testing and Integration:

  • Added a unit test to verify markdown generation using a vLLM endpoint, including environment variable handling and client instantiation checks.
  • Introduced a new integration test to exercise the full VisionParser stack against a live vLLM endpoint, with connection checks and test skipping if no endpoint is available.
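The connection check that gates the integration test can be implemented as a cheap TCP probe, so the suite skips rather than fails when no endpoint is running. A sketch, assuming the endpoint comes from an environment variable such as `VLLM_BASE_URL`:

```python
import socket
from urllib.parse import urlparse

def vllm_endpoint_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the vLLM endpoint succeeds."""
    parsed = urlparse(url)
    host = parsed.hostname or "localhost"
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

In the test module this would gate the test, e.g. `@pytest.mark.skipif(not vllm_endpoint_reachable(url), reason="no live vLLM endpoint")` combined with the `integration` marker.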

Configuration and Test Infrastructure:

  • Added a pytest marker for integration tests and updated test configuration in pyproject.toml.
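Registering the marker in pyproject.toml typically looks like the fragment below; the marker name matches the description above, while the help text is an assumption.

```toml
[tool.pytest.ini_options]
markers = [
    "integration: tests that require a live vLLM endpoint",
]
```

Integration tests can then be selected with `pytest -m integration` or excluded with `pytest -m "not integration"`.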

These changes collectively enable users to run vision parsing tasks against vLLM endpoints with minimal configuration changes and provide robust test coverage for this new backend.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Ran make lint and make format to handle lint / formatting issues.
  • Ran make test to run the relevant test scripts.
  • Read the contributor guidelines.
  • Wrote necessary unit or integration tests.

- Updated test cases in  to enhance coverage and improve structure.
- Added new integration tests for vLLM in  to validate functionality against a live endpoint.
- Improved error handling and assertions in existing tests for better reliability.
- Ensured compatibility with new configurations and models in the VisionParser.
@dosubot dosubot bot added size:XXL This PR changes 1000+ lines, ignoring generated files. enhancement New feature or request labels Sep 17, 2025