
Conversation


@wenxi-onyx wenxi-onyx commented Sep 26, 2025

Description

Some notes:

  • This PR supports both self-hosted Ollama and Ollama Cloud
  • Ollama no longer requires the user to set a context limit
  • Users can access Ollama Cloud through their self-hosted server by running ollama signin --> this means Cloud models are reachable without an API key, and both cloud and self-hosted models can appear under a single LLM configuration
  • All cloud models appear to be universally available and cost nothing beyond the monthly subscription
  • Regardless of api_base, GET /api/tags will list the available models
    • To see the context limit, you need to POST /api/show with the model name
      • The context size is stored inside model_info under the key {architecture}.context_length --> i.e. the key differs for every model architecture (see the sketch after this list)
  • This PR abstracts the model-fetching behavior, which is also used for Bedrock
  • Adds an image-support boolean to the model config table, because we can get this info from Ollama and litellm's model map is flaky / rarely updated
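
To make the discovery flow above concrete, here is a minimal sketch of querying /api/tags and /api/show and deriving the per-model context limit and image support. The endpoint paths follow the public Ollama API, but the function names, the default api_base, and the exact response parsing are illustrative assumptions rather than the code in this PR:

```python
# Hedged sketch only: endpoint paths follow the public Ollama API, but the
# function names, parsing details, and defaults here are assumptions.
import requests

DEFAULT_OLLAMA_API_BASE = "http://127.0.0.1:11434"  # assumed self-hosted default


def _auth_headers(api_key: str | None) -> dict[str, str]:
    # An API key is only needed for Ollama Cloud; self-hosted servers accept
    # unauthenticated requests.
    return {"Authorization": f"Bearer {api_key}"} if api_key else {}


def list_ollama_models(api_base: str = DEFAULT_OLLAMA_API_BASE, api_key: str | None = None) -> list[str]:
    # GET /api/tags lists every model visible to the server (self-hosted and,
    # after `ollama signin`, cloud models as well).
    resp = requests.get(f"{api_base}/api/tags", headers=_auth_headers(api_key), timeout=10)
    resp.raise_for_status()
    return sorted(m["name"] for m in resp.json().get("models", []))


def describe_ollama_model(
    model_name: str,
    api_base: str = DEFAULT_OLLAMA_API_BASE,
    api_key: str | None = None,
) -> dict:
    # POST /api/show returns per-model metadata, including model_info.
    resp = requests.post(
        f"{api_base}/api/show",
        json={"model": model_name},
        headers=_auth_headers(api_key),
        timeout=10,
    )
    resp.raise_for_status()
    data = resp.json()

    # The context size lives under model_info as "<architecture>.context_length",
    # so the key is different for every model (e.g. "llama.context_length").
    model_info = data.get("model_info", {})
    architecture = model_info.get("general.architecture", "")
    max_input_tokens = model_info.get(f"{architecture}.context_length")

    # Recent Ollama versions expose a "capabilities" list; treating "vision"
    # membership as image support is an assumption here.
    supports_image_input = "vision" in data.get("capabilities", [])

    return {
        "name": model_name,
        "max_input_tokens": max_input_tokens,
        "supports_image_input": supports_image_input,
    }
```

In this sketch a missing api_key simply means no Authorization header is sent, which matches the self-hosted case where no key is required.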

How Has This Been Tested?

[Describe the tests you ran to verify your changes]

Additional Options

  • [Optional] Override Linear Check

Note

Adds official Ollama support with model discovery and vision capability detection, introduces supports_image_input on model configs, and updates UI/backend to fetch and manage provider models dynamically.

  • Database:
    • Add nullable model_configuration.supports_image_input column (with migration) and normalize is_visible nulls to false.
  • Backend:
    • Ollama support: provider constants/options (optional API key, default API base) and provider-specific Authorization header handling.
    • New admin endpoint POST /admin/llm/ollama/available-models to fetch models via Ollama /api/tags and /api/show, deriving max_input_tokens and supports_image_input.
    • Persist supports_image_input in ModelConfiguration; model_supports_image_input now checks the DB for Ollama before falling back to litellm (see the sketch after this list).
    • Remove Ollama num_ctx model kwargs; unify extra headers construction.
  • Admin API/UI:
    • Extend provider descriptors (e.g., default_api_base) and require explicit is_visible in model config requests.
    • Refactor model fetching to a dynamic flow (Bedrock, Ollama) and store fetched configurations in form state; auto-fetch on edit; preserve selections; update selectors to use fetched models.
    • Adjust configured provider display logic when default descriptors have empty model lists.
  • Chat/UI:
    • Only require image-capable model selection when image files are present in the message.
  • Tests:
    • Update LLM provider tests to use explicit is_visible and revised expectations.
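
As referenced in the Backend bullet above, a minimal sketch of the DB-first image-support check could look like the following. The lookup helper and provider constant are illustrative assumptions based on this summary, not the repository's actual code; only the fallback order comes from the PR:

```python
# Hedged sketch of the DB-first check; the lookup helper and provider constant
# are illustrative assumptions, only the fallback order comes from the summary.
from typing import Optional

import litellm

OLLAMA_PROVIDER_NAME = "ollama"  # assumed provider constant


def lookup_persisted_image_support(model_name: str) -> Optional[bool]:
    """Hypothetical helper: read model_configuration.supports_image_input for
    the given model from the database (None if the model was never probed)."""
    raise NotImplementedError


def model_supports_image_input(provider: str, model_name: str) -> bool:
    if provider == OLLAMA_PROVIDER_NAME:
        # Prefer the value discovered via /api/show and persisted in the new
        # supports_image_input column, since litellm's model map is flaky or
        # rarely updated for Ollama models.
        persisted = lookup_persisted_image_support(model_name)
        if persisted is not None:
            return persisted

    # Fall back to litellm's static model map for other providers (and for
    # Ollama models that were never probed).
    try:
        return litellm.supports_vision(model=model_name)
    except Exception:
        return False
```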

Written by Cursor Bugbot for commit a9184d8. This will update automatically on new commits.

@wenxi-onyx wenxi-onyx requested a review from a team as a code owner September 26, 2025 17:35

vercel bot commented Sep 26, 2025

The latest updates on your projects.

Project | Deployment | Preview | Comments | Updated (UTC)
internal-search | Ready | Preview | Comment | Oct 1, 2025 11:20pm

@wenxi-onyx wenxi-onyx changed the title from "feat: ollama support" to "feat: ollama official support" on Sep 26, 2025


@greptile-apps greptile-apps bot left a comment


Greptile Overview

Summary

This PR adds comprehensive Ollama support to the Onyx platform, enabling both local Ollama server integration and Ollama Cloud connectivity. The implementation spans backend provider configuration, API endpoints for model discovery, and frontend UI enhancements.

Key Changes

  • Backend Integration: Added Ollama provider configuration with authentication support via optional API keys for cloud connections
  • Model Discovery: New API endpoint to fetch available models from Ollama servers with fallback endpoint support
  • UI Enhancement: Frontend form now includes Ollama-specific model fetching functionality with default localhost configuration
  • Type Safety: Added proper TypeScript/Python models for Ollama configuration requests

Issues Found

  • Authentication Bug: the Authorization header construction has a case-sensitivity issue that could prevent proper authentication (see the sketch below)
  • Missing Authentication: the model-fetching API doesn't use the provided API key for Ollama Cloud connections

The changes follow established patterns in the codebase and properly integrate with existing LLM provider infrastructure.
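
For context on the case-sensitivity point above, a tiny sketch of what a case-insensitive Authorization merge could look like; the function and dict names are assumed, not taken from the repository's factory code:

```python
# Hedged sketch of a case-insensitive Authorization merge; the function and
# dict names are assumptions, not the repository's factory code.
def merge_ollama_auth_header(extra_headers: dict[str, str], api_key: str | None) -> dict[str, str]:
    merged = dict(extra_headers)
    if api_key:
        # HTTP header names are case-insensitive, so look for an existing
        # Authorization header regardless of casing before injecting one.
        if not any(name.lower() == "authorization" for name in merged):
            merged["Authorization"] = f"Bearer {api_key}"
    return merged
```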

Confidence Score: 3/5

  • This PR has logical issues that could break Ollama Cloud authentication but is otherwise safe for local Ollama usage
  • Score reflects two authentication bugs that would prevent Ollama Cloud connections from working properly, though local Ollama integration should function correctly
  • backend/onyx/llm/factory.py and backend/onyx/server/manage/llm/api.py require fixes for proper authentication

Important Files Changed

File Analysis

Filename | Score | Overview
backend/onyx/llm/factory.py | 4/5 | Added Ollama provider support with authentication headers and model kwargs configuration
backend/onyx/server/manage/llm/api.py | 4/5 | Added Ollama models fetching API endpoint with proper error handling and multiple endpoint support
web/src/app/admin/configuration/llm/LLMProviderUpdateForm.tsx | 4/5 | Added Ollama-specific UI for model fetching with default API base and model management

Sequence Diagram

sequenceDiagram
    participant U as User/Admin
    participant UI as Frontend UI
    participant API as Backend API
    participant OS as Ollama Server
    participant LF as LLM Factory
    participant Chat as Chat System

    Note over U, Chat: Ollama Provider Setup Flow
    
    U->>UI: Configure Ollama Provider
    UI->>UI: Set default API base (127.0.0.1:11434)
    U->>UI: Click "Fetch Available Models"
    UI->>API: POST /admin/llm/ollama/available-models
    API->>OS: GET /api/tags (try first endpoint)
    alt Success
        OS->>API: Return models list
        API->>API: Extract model names
        API->>UI: Return sorted model names
        UI->>UI: Update model configurations
        UI->>U: Show success message
    else Endpoint not found
        API->>OS: GET /api/models (try second endpoint)
        OS->>API: Return models list
        API->>API: Extract model names
        API->>UI: Return sorted model names
    else Connection/Auth failure
        API->>UI: Return error message
        UI->>U: Show error popup
    end

    Note over U, Chat: Chat Request Flow
    
    U->>Chat: Send chat message
    Chat->>LF: Request LLM instance
    LF->>LF: Build Ollama auth headers (if API key provided)
    LF->>LF: Set num_ctx model kwargs for Ollama
    LF->>Chat: Return configured LLM
    Chat->>OS: Send chat request with headers
    OS->>Chat: Return chat response
    Chat->>U: Display response

6 files reviewed, 2 comments



@cubic-dev-ai cubic-dev-ai bot left a comment


3 issues found across 6 files

Prompt for AI agents (all 3 issues)

Understand the root cause of the following 3 issues and fix them.


<file name="web/src/app/admin/configuration/llm/LLMProviderUpdateForm.tsx">

<violation number="1" location="web/src/app/admin/configuration/llm/LLMProviderUpdateForm.tsx:282">
The feature to fetch available Ollama models does not use the configured API key. This will cause the feature to fail for any user connecting to a secured Ollama instance (e.g., Ollama Cloud), as the request to list models will be unauthenticated. The frontend doesn't send the key, and the backend API doesn't use it, despite the API model including a field for it.</violation>

<violation number="2" location="web/src/app/admin/configuration/llm/LLMProviderUpdateForm.tsx:297">
Pass the api_key from form values in the request body so the backend can authenticate to secured Ollama instances when listing models.</violation>
</file>

<file name="backend/onyx/server/manage/llm/api.py">

<violation number="1" location="backend/onyx/server/manage/llm/api.py:509">
Include the Authorization header built from the provided api_key when fetching Ollama models; without it, secured instances will reject the request.</violation>
</file>


@wenxi-onyx wenxi-onyx force-pushed the whuang/ollama-support branch from 5aee7ba to 1ef8a32 on September 28, 2025 19:01
@wenxi-onyx wenxi-onyx force-pushed the whuang/ollama-support branch from 4b0637d to a9184d8 on October 1, 2025 23:11