Add Ollama support #5414
Conversation
@scorpyy10 is attempting to deploy a commit to the Danswer Team on Vercel. A member of the Team first needs to authorize it.
Greptile Summary
This PR introduces Ollama provider support to the Onyx LLM ecosystem, along with improved JSON response handling and chat input UX enhancements. The changes span both backend and frontend components to enable dynamic model discovery from local Ollama servers.
On the backend, the core addition is in llm_provider_options.py, where Ollama is integrated as a new LLM provider with both static fallback models and dynamic model discovery via fetch_available_ollama_models(). This function makes HTTP requests to running Ollama servers to retrieve available models. A new admin API endpoint, /ollama/models in api.py, exposes this functionality to the frontend with proper error handling.
The frontend receives a new FetchOllamaModelsButton component that provides a user-friendly interface for administrators to discover available models from their Ollama server. The existing LLMProviderUpdateForm is enhanced to support this dynamic model fetching, with new state management for availableModels and URL normalization logic that adds the http:// scheme when it is missing for local Ollama servers.
Additional improvements include enhanced JSON response handling in agent_search/shared_graph_utils/llm.py, with better fallback mechanisms for malformed responses, which is particularly important for LLMs that don't natively support JSON schema. The chat input experience is refined with a dynamic requiresImageGeneration property that only filters for vision-capable models when images are actually present. A minor modal display logic change in ChatPage.tsx removes the mutual exclusion between the API key and welcome modals.
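To make the JSON fallback idea above concrete, here is a rough sketch of the parse-then-salvage pattern. It is not the PR's implementation: the DecisionResponse fields and the regex-based salvage step are assumptions, and only the general shape (strict parse, salvage embedded JSON, then a sensible default) reflects the summary.

```python
# Rough sketch of JSON parsing with fallbacks for LLM output (illustrative only).
import json
import re

from pydantic import BaseModel, ValidationError


class DecisionResponse(BaseModel):  # field names are assumed for illustration
    decision: str
    reasoning: str


def parse_llm_json(raw_content: str, schema: type[BaseModel]) -> BaseModel:
    # 1. Strict parse: the happy path for models that emit valid JSON.
    try:
        return schema.model_validate_json(raw_content)
    except ValidationError:
        pass

    # 2. Salvage: pull the first JSON object out of surrounding prose.
    match = re.search(r"\{.*\}", raw_content, re.DOTALL)
    if match:
        try:
            return schema.model_validate(json.loads(match.group(0)))
        except (ValidationError, ValueError):
            pass

    # 3. Default: build a reasonable object, keeping the raw text as reasoning.
    if issubclass(schema, DecisionResponse):
        return DecisionResponse(decision="LLM", reasoning=raw_content)
    raise ValueError("LLM response could not be parsed into the requested schema")
```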
These changes integrate seamlessly with the existing LLM provider infrastructure, following established patterns while accommodating Ollama's unique characteristics of requiring dynamic model discovery and typically running without authentication on local servers.
Confidence score: 3/5
- This PR requires careful review due to potential performance and error handling issues in critical backend components
- Score lowered due to synchronous HTTP calls in provider configuration loading and inconsistent error handling patterns that could cause blocking behavior
- Pay close attention to backend/onyx/llm/llm_provider_options.py for performance implications and error handling improvements
Context used:
Rule - Remove temporary debugging code before merging to production, especially tenant-specific debugging logs. (link)
7 files reviewed, 5 comments
const errorData = await response.json();
throw new Error(errorData.detail || "Failed to fetch models");
logic: The error response parsing assumes JSON format but doesn't handle cases where the response isn't valid JSON, which could cause additional errors.
const errorData = await response.json();
throw new Error(errorData.detail || "Failed to fetch models");
let errorMessage = "Failed to fetch models";
try {
  const errorData = await response.json();
  errorMessage = errorData.detail || errorMessage;
} catch {
  // Use default message if JSON parsing fails
}
throw new Error(errorMessage);
# Fallback: if schema expects DecisionResponse and model didn't return JSON,
# create a sensible default using raw content as reasoning
try:
    if schema.__name__ == "DecisionResponse":
style: String comparison of schema names is fragile - consider using isinstance() or hasattr() checks instead
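A small sketch of the more robust check being suggested, comparing against the class object rather than its name (assuming schema is the Pydantic model class passed into invoke_llm_json; the DecisionResponse fields are placeholders):

```python
# Sketch: identify the schema by class identity/inheritance, not by its name string.
from pydantic import BaseModel


class DecisionResponse(BaseModel):  # stand-in for the PR's schema class
    decision: str
    reasoning: str


def is_decision_schema(schema: type[BaseModel]) -> bool:
    # Survives subclassing and renames, unlike schema.__name__ == "DecisionResponse".
    return isinstance(schema, type) and issubclass(schema, DecisionResponse)
```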
"""Endpoints for all""" | ||
|
||
|
||
@admin_router.get("/ollama/models") |
style: Consider using a synchronous function instead of async, since fetch_available_ollama_models is not async and no awaitable operations are performed
@admin_router.get("/ollama/models")
@admin_router.get("/ollama/models")
def fetch_ollama_models(
if provider_name == OLLAMA_PROVIDER_NAME:
    try:
        # Get the actual models from the Ollama server
        ollama_models = fetch_available_ollama_models()
logic: Calling fetch_available_ollama_models() here means every provider configuration load triggers an HTTP request. This could cause performance issues when listing all providers.
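One way to address this, sketched under the assumption that a short TTL is acceptable for the admin listing; the cache helper below is illustrative and not part of the PR:

```python
# Illustrative TTL cache around the discovery call so provider listings don't
# hit the Ollama server on every request (not the PR's code).
import time

from onyx.llm.llm_provider_options import fetch_available_ollama_models

_CACHE_TTL_SECONDS = 60.0
_cached_models: list[str] = []
_cached_at: float = 0.0


def cached_ollama_models() -> list[str]:
    global _cached_models, _cached_at
    if _cached_models and time.monotonic() - _cached_at < _CACHE_TTL_SECONDS:
        return _cached_models
    _cached_models = fetch_available_ollama_models()
    _cached_at = time.monotonic()
    return _cached_models
```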
import requests
from typing import List, Optional
style: Import order should follow PEP8 - standard library imports (typing) should come before third-party imports (requests).
import requests
from typing import List, Optional
from typing import List, Optional
import requests
7 issues found across 7 files
Prompt for AI agents (all 7 issues)
Understand the root cause of the following 7 issues and fix them.
<file name="web/src/app/admin/configuration/llm/LLMProviderUpdateForm.tsx">
<violation number="1" location="web/src/app/admin/configuration/llm/LLMProviderUpdateForm.tsx:215">
Implicitly prefixing http:// for all providers can downgrade security and send credentials over plaintext; restrict the auto-scheme behavior to Ollama only.</violation>
</file>
<file name="backend/onyx/agents/agent_search/shared_graph_utils/llm.py">
<violation number="1" location="backend/onyx/agents/agent_search/shared_graph_utils/llm.py:206">
This comment claims the original error will be raised, but the code swallows the exception with pass, obscuring the root cause; either re-raise or update the comment to reflect behavior.</violation>
</file>
<file name="backend/onyx/llm/llm_provider_options.py">
<violation number="1" location="backend/onyx/llm/llm_provider_options.py:222">
Replace print with the project logger and avoid logging raw exception strings; prefer sanitized context (e.g., status code) or a concise message.
(Based on your team's feedback about not logging raw exceptions that may contain URLs/tokens.)</violation>
<violation number="2" location="backend/onyx/llm/llm_provider_options.py:378">
Avoid making a network call during provider configuration retrieval; this executes on every provider listing and can block the admin UI. Use static defaults or a cached/explicit fetch instead of hitting the Ollama API here.</violation>
<violation number="3" location="backend/onyx/llm/llm_provider_options.py:393">
Use the project's logger instead of print and avoid logging raw exception details that may include sensitive information.
(Based on your team's feedback about not logging raw exceptions that may contain URLs/tokens.)</violation>
</file>
<file name="web/src/app/admin/configuration/llm/FetchOllamaModelsButton.tsx">
<violation number="1" location="web/src/app/admin/configuration/llm/FetchOllamaModelsButton.tsx:51">
Avoid logging the raw exception object; it can expose sensitive data. Log a generic message or safe metadata instead.
(Based on your team's feedback about not logging raw exception strings that may contain URLs with temporary auth tokens.)</violation>
</file>
<file name="backend/onyx/server/manage/llm/api.py">
<violation number="1" location="backend/onyx/server/manage/llm/api.py:323">
Async endpoint calls a blocking requests-based helper, risking event loop blocking. Make the endpoint sync (def) so FastAPI runs it in a threadpool, or switch to an async HTTP client.</violation>
</file>
Since this is your first cubic review, here's how it works:
- cubic automatically reviews your code and comments on bugs and improvements
- Teach cubic by replying to its comments. cubic learns from your replies and gets better over time
- Ask questions if you need clarification on any suggestion
React with 👍 or 👎 to teach cubic. Mention @cubic-dev-ai to give feedback, ask questions, or re-run the review.
}

// Normalize api_base: ensure scheme for providers like Ollama
if (finalApiBase) {
Implicitly prefixing http:// for all providers can downgrade security and send credentials over plaintext; restrict the auto-scheme behavior to Ollama only.
Prompt for AI agents
Address the following comment on web/src/app/admin/configuration/llm/LLMProviderUpdateForm.tsx at line 215:
<comment>Implicitly prefixing http:// for all providers can downgrade security and send credentials over plaintext; restrict the auto-scheme behavior to Ollama only.</comment>
<file context>
@@ -209,16 +211,30 @@ export function LLMProviderUpdateForm({
}
+ // Normalize api_base: ensure scheme for providers like Ollama
+ if (finalApiBase) {
+ const hasScheme = /^https?:\/\//i.test(finalApiBase);
+ if (!hasScheme) {
</file context>
if (finalApiBase) {
if (finalApiBase && llmProviderDescriptor.name === "ollama") {
✅ Addressed in 7fad3c7
decision="LLM", | ||
)) | ||
except Exception: | ||
# If even fallback construction fails, raise the original error |
This comment claims the original error will be raised, but the code swallows the exception with pass, obscuring the root cause; either re-raise or update the comment to reflect behavior.
Prompt for AI agents
Address the following comment on backend/onyx/agents/agent_search/shared_graph_utils/llm.py at line 206:
<comment>This comment claims the original error will be raised, but the code swallows the exception with pass, obscuring the root cause; either re-raise or update the comment to reflect behavior.</comment>
<file context>
@@ -177,9 +179,38 @@ def invoke_llm_json(
+ decision="LLM",
+ ))
+ except Exception:
+ # If even fallback construction fails, raise the original error
+ pass
+
</file context>
# If even fallback construction fails, raise the original error
# If fallback construction fails, proceed and raise a descriptive error below
        for model_name in ollama_models
    ]
except Exception as e:
    print(f"Error fetching Ollama models: {e}")
Use the project's logger instead of print and avoid logging raw exception details that may include sensitive information.
(Based on your team's feedback about not logging raw exceptions that may contain URLs/tokens.)
Prompt for AI agents
Address the following comment on backend/onyx/llm/llm_provider_options.py at line 393:
<comment>Use the project's logger instead of print and avoid logging raw exception details that may include sensitive information.
(Based on your team's feedback about not logging raw exceptions that may contain URLs/tokens.)</comment>
<file context>
@@ -298,9 +371,28 @@ def fetch_visible_model_names_for_provider_as_set(
+ for model_name in ollama_models
+ ]
+ except Exception as e:
+ print(f"Error fetching Ollama models: {e}")
+
+ # Fallback for other providers
</file context>
✅ Addressed in 7fad3c7
except requests.exceptions.RequestException as e:
    print(f"Error connecting to Ollama API at {base_url}: {e}")
except Exception as e:
    print(f"Unexpected error fetching Ollama models: {e}")
Replace print with the project logger and avoid logging raw exception strings; prefer sanitized context (e.g., status code) or a concise message.
(Based on your team's feedback about not logging raw exceptions that may contain URLs/tokens.)
Prompt for AI agents
Address the following comment on backend/onyx/llm/llm_provider_options.py at line 222:
<comment>Replace print with the project logger and avoid logging raw exception strings; prefer sanitized context (e.g., status code) or a concise message.
(Based on your team's feedback about not logging raw exceptions that may contain URLs/tokens.)</comment>
<file context>
@@ -154,24 +156,95 @@ class WellKnownLLMProviderDescriptor(BaseModel):
+ except requests.exceptions.RequestException as e:
+ print(f"Error connecting to Ollama API at {base_url}: {e}")
+ except Exception as e:
+ print(f"Unexpected error fetching Ollama models: {e}")
+ # Return default models if API call fails
+ return OLLAMA_MODEL_NAMES
</file context>
✅ Addressed in 7fad3c7
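For reference, a sketch of the sanitized-logging shape these two comments are asking for; the setup_logger import path is an assumption about the project's logging helper, and the static fallback model names are placeholders:

```python
# Sketch: project logger with a concise, sanitized message instead of print()
# (the setup_logger import path is assumed; fallback model names are placeholders).
import requests

from onyx.utils.logger import setup_logger

logger = setup_logger()

OLLAMA_MODEL_NAMES: list[str] = ["llama3.1", "mistral"]


def fetch_available_ollama_models(base_url: str = "http://localhost:11434") -> list[str]:
    try:
        response = requests.get(f"{base_url}/api/tags", timeout=5)
        response.raise_for_status()
        return [model["name"] for model in response.json().get("models", [])]
    except requests.exceptions.RequestException:
        # No raw exception text, URL, or token values in the log line.
        logger.warning("Could not reach the Ollama API; returning the static fallback list")
        return OLLAMA_MODEL_NAMES
```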
type: "success", | ||
}); | ||
} catch (error) { | ||
console.error("Error fetching Ollama models:", error); |
Avoid logging the raw exception object; it can expose sensitive data. Log a generic message or safe metadata instead.
(Based on your team's feedback about not logging raw exception strings that may contain URLs with temporary auth tokens.)
Prompt for AI agents
Address the following comment on web/src/app/admin/configuration/llm/FetchOllamaModelsButton.tsx at line 51:
<comment>Avoid logging the raw exception object; it can expose sensitive data. Log a generic message or safe metadata instead.
(Based on your team's feedback about not logging raw exception strings that may contain URLs with temporary auth tokens.)</comment>
<file context>
@@ -0,0 +1,79 @@
+ type: "success",
+ });
+ } catch (error) {
+ console.error("Error fetching Ollama models:", error);
+ setPopup({
+ message: `Error fetching models: ${error instanceof Error ? error.message : String(error)}`,
</file context>
console.error("Error fetching Ollama models:", error); | |
console.error("Error fetching Ollama models"); |
✅ Addressed in 7fad3c7
@admin_router.get("/ollama/models")
async def fetch_ollama_models(
Async endpoint calls a blocking requests-based helper, risking event loop blocking. Make the endpoint sync (def) so FastAPI runs it in a threadpool, or switch to an async HTTP client.
Prompt for AI agents
Address the following comment on backend/onyx/server/manage/llm/api.py at line 323:
<comment>Async endpoint calls a blocking requests-based helper, risking event loop blocking. Make the endpoint sync (def) so FastAPI runs it in a threadpool, or switch to an async HTTP client.</comment>
<file context>
@@ -318,6 +319,33 @@ def get_vision_capable_providers(
+@admin_router.get("/ollama/models")
+async def fetch_ollama_models(
+ api_base: str = Query(
+ "http://localhost:11434",
</file context>
async def fetch_ollama_models(
def fetch_ollama_models(
✅ Addressed in 7fad3c7
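If keeping the endpoint async were preferred over the sync suggestion above, the blocking helper could instead be dispatched to FastAPI's threadpool; a sketch under the assumption that the helper accepts a base-URL argument:

```python
# Alternative sketch: keep `async def` but run the blocking requests-based helper
# in the threadpool so the event loop stays responsive (helper signature assumed).
from fastapi import APIRouter, Query
from fastapi.concurrency import run_in_threadpool

from onyx.llm.llm_provider_options import fetch_available_ollama_models

admin_router = APIRouter()


@admin_router.get("/ollama/models")
async def fetch_ollama_models(
    api_base: str = Query("http://localhost:11434"),
) -> list[str]:
    return await run_in_threadpool(fetch_available_ollama_models, api_base)
```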
if provider_name == OLLAMA_PROVIDER_NAME:
    try:
        # Get the actual models from the Ollama server
        ollama_models = fetch_available_ollama_models()
Avoid making a network call during provider configuration retrieval; this executes on every provider listing and can block the admin UI. Use static defaults or a cached/explicit fetch instead of hitting the Ollama API here.
Prompt for AI agents
Address the following comment on backend/onyx/llm/llm_provider_options.py at line 378:
<comment>Avoid making a network call during provider configuration retrieval; this executes on every provider listing and can block the admin UI. Use static defaults or a cached/explicit fetch instead of hitting the Ollama API here.</comment>
<file context>
@@ -298,9 +371,28 @@ def fetch_visible_model_names_for_provider_as_set(
+ if provider_name == OLLAMA_PROVIDER_NAME:
+ try:
+ # Get the actual models from the Ollama server
+ ollama_models = fetch_available_ollama_models()
+ if ollama_models:
+ return [
</file context>
✅ Addressed in 7fad3c7
ba8c035 to 7fad3c7 (Compare)
Hi Yash, appreciate the PR. I believe you may have deleted the actual Ollama support and only left the changes I asked you to remove. I'm closing this one as I've opened #5509 --> this ended up being a bit complex for a first PR (apologies for that).
Description
This PR adds Ollama support, improves JSON response handling across LLM endpoints, and enhances the chat input UX.
Backend (onyx/backend/onyx/server/manage/llm/api.py)
- Add Ollama provider integration and model discovery/fetch support
- Standardize JSON response shape and error handling
- Clearer status codes and validation messages
Web – Admin
- onyx/web/src/app/admin/configuration/llm/FetchOllamaModelsButton.tsx: new action to fetch available Ollama models with proper loading and error states
- onyx/web/src/app/admin/configuration/llm/LLMProviderUpdateForm.tsx: improved validation, state sync, and UX hints
Web – Chat
- onyx/web/src/app/chat/components/input/ChatInputBar.tsx: improved Enter vs Shift+Enter handling, better IME/composition behavior, safer paste handling, minor a11y tweaks
How Has This Been Tested?
Web – Admin
“Fetch Ollama Models”:
Shows loading state, renders models on success, displays error toast/message on failure
Provider form:
Valid/invalid inputs show inline validation; save/reset behavior is consistent
Web – Chat
ChatInputBar:
Enter sends; Shift+Enter adds newline
IME composition doesn’t prematurely send
Pasting large text doesn’t freeze input and preserves formatting
Tab focus order and ARIA labels checked
Summary by cubic
Adds first-class Ollama support with model discovery, hardens JSON LLM responses, and polishes chat input/model picker UX.
New Features
Bug Fixes