feat: add support for SwissAI Apertus LLM #800
Conversation
📝 Walkthrough
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Possibly related PRs
Suggested reviewers
Pre-merge checks (2 passed, 1 warning)
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 0
🧹 Nitpick comments (4)
daras_ai_v2/settings.py (1)
514-514: PUBLICAI_API_KEY added — please document and validate missing-key behavior
Looks good. Please add PUBLICAI_API_KEY to your env samples/ops docs and confirm that selecting a swiss-ai/* model without this key surfaces a clear error to the user (not just a 401 from the provider).
I can open a docs patch and add a startup check that warns if PUBLICAI_API_KEY is empty.
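For reference, a minimal sketch of what that startup check could look like (assumes Django settings and stdlib logging; the function name and where it gets called from are placeholders, not part of this PR):

```python
import logging
import warnings

from django.conf import settings

logger = logging.getLogger(__name__)


def warn_if_publicai_key_missing() -> None:
    # Hypothetical startup hook: surface a clear warning up front
    # instead of an opaque 401 from the provider at request time.
    if not getattr(settings, "PUBLICAI_API_KEY", ""):
        msg = "PUBLICAI_API_KEY is not set; swiss-ai/* (Apertus) models will fail at request time."
        warnings.warn(msg, RuntimeWarning)
        logger.warning(msg)
```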
daras_ai_v2/language_model.py (3)
83-91: New model spec: verify real token limits and JSON/tool support
Spec wiring looks correct. Per your past guidance, please confirm context_window=65,536 and max_output_tokens=4,096 from actual API behavior (not docs), and whether JSON mode is supported. If JSON is supported via PublicAI, set supports_json=True to enable response_format.
I can add an integration check that probes max tokens and JSON mode and auto-adjusts the spec.
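A rough sketch of such a probe, using the OpenAI SDK against the endpoint added in this PR (the model id, the 4,096 output limit, and JSON-mode support are assumptions to verify, not confirmed behavior):

```python
import os

from openai import OpenAI

# Assumed OpenAI-compatible endpoint and model id from this PR
client = OpenAI(base_url="https://api.publicai.co/v1", api_key=os.environ["PUBLICAI_API_KEY"])
MODEL = "swiss-ai/apertus-70b-instruct"

# Probe the advertised output limit: an error here suggests max_output_tokens is set too high
resp = client.chat.completions.create(
    model=MODEL,
    messages=[{"role": "user", "content": "Say hi."}],
    max_tokens=4096,
)
print("usage:", resp.usage)

# Probe JSON mode: a rejection here means supports_json should stay False
try:
    client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": 'Reply with {"ok": true} as JSON.'}],
        response_format={"type": "json_object"},
    )
    print("JSON mode: accepted")
except Exception as exc:
    print("JSON mode: rejected:", exc)
```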
1656-1659: Disable tools with a visible warning (don’t fail silently)
Silently nulling tools can confuse callers. Log a warning when tools are provided, so it’s obvious they were ignored for this model.
Apply within this hunk:
```diff
 if model == LargeLanguageModels.apertus_70b_instruct:
-    # Swiss AI Apertus model doesn't support tool calling
-    tools = None
+    # Swiss AI Apertus model doesn't support tool calling
+    if tools:
+        logger.warning("Tools are not supported for %s; disabling tool calls.", model.name)
+    tools = None
```
1934-1941: PublicAI routing: double-check base_url and feature parity (streaming, usage, penalties)
Client wiring looks fine. Please verify:
- base_url "https://api.publicai.co/v1" is correct for OpenAI-compatible chat completions (and supports streaming).
- Usage tokens are returned in OpenAI shape (so record_openai_llm_usage works).
- Frequency/presence penalties are accepted or safely ignored.
I can add a small probe script to hit chat.completions with stream=True and confirm response headers/usage.
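A sketch of that probe (the model id and the stream_options flag are assumptions; the provider may ignore or reject either, which is exactly what the probe would reveal):

```python
import os

from openai import OpenAI

client = OpenAI(base_url="https://api.publicai.co/v1", api_key=os.environ["PUBLICAI_API_KEY"])

stream = client.chat.completions.create(
    model="swiss-ai/apertus-70b-instruct",  # assumed model id
    messages=[{"role": "user", "content": "Count to five."}],
    stream=True,
    stream_options={"include_usage": True},  # OpenAI-style; verify PublicAI honors it
    frequency_penalty=0.1,
    presence_penalty=0.1,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
    if chunk.usage:  # final chunk carries usage when include_usage is honored
        print("\nusage:", chunk.usage)
```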
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- daras_ai_v2/language_model.py (3 hunks)
- daras_ai_v2/settings.py (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: nikochiko
PR: GooeyAI/gooey-server#768
File: daras_ai_v2/language_model.py:124-126
Timestamp: 2025-08-12T08:22:19.003Z
Learning: When setting token limits for language models in daras_ai_v2/language_model.py, prioritize actual API behavior and platform-specific documentation over general model documentation to ensure consistent UX and avoid API errors.
Learnt from: nikochiko
PR: GooeyAI/gooey-server#768
File: daras_ai_v2/language_model.py:124-126
Timestamp: 2025-08-12T08:22:19.003Z
Learning: When determining language model token limits in daras_ai_v2/language_model.py, prioritize actual API behavior and direct testing over general documentation, as different model variants within the same family may have different practical limits.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test (3.10.12, 1.8.3)
Actionable comments posted: 0
🧹 Nitpick comments (2)
usage_costs/models.py (1)
66-69: Optional: keep IntegerChoices ordered for readability
Placing aks = 5 after publicai = 14 is a bit jarring when scanning. Consider grouping/sorting by id:

```diff
 class ModelProvider(models.IntegerChoices):
     openai = 1, "OpenAI"
     google = 2, "Google"
     together_ai = 3, "TogetherAI"
     azure_openai = 4, "Azure OpenAI"
+    aks = 5, "Azure Kubernetes Service"
     anthropic = 6, "Anthropic"
     groq = 7, "groq"
     fireworks = 8, "Fireworks AI"
     mistral = 9, "Mistral AI"
     sarvam = 10, "sarvam.ai"
     fal_ai = 11, "fal.ai"
     twilio = 12, "Twilio"
     sea_lion = 13, "sea-lion.ai"
     publicai = 14, "PublicAI"
-
-    aks = 5, "Azure Kubernetes Service"
```

usage_costs/migrations/0035_alter_modelpricing_model_name_and_more.py (1)
1-14: Ruff RUF012 on migrations: prefer per-file ignore
Auto-generated migrations trigger the RUF012 (ClassVar) lint. Recommend ignoring migrations to avoid churn.
Add to your Ruff config:

```diff
+[tool.ruff.lint.per-file-ignores]
+"**/migrations/*.py" = ["RUF012"]
```

I can raise a tiny PR to add this to pyproject.toml if you want.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- scripts/init_llm_pricing.py (1 hunks)
- usage_costs/migrations/0035_alter_modelpricing_model_name_and_more.py (1 hunks)
- usage_costs/models.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
scripts/init_llm_pricing.py (2)
- daras_ai_v2/language_model.py (1): LargeLanguageModels (82-1025)
- usage_costs/models.py (1): ModelProvider (53-68)
usage_costs/migrations/0035_alter_modelpricing_model_name_and_more.py (2)
- usage_costs/migrations/0033_alter_modelpricing_model_name_and_more.py (1): Migration (6-297)
- usage_costs/migrations/0032_alter_modelpricing_model_name_alter_modelpricing_sku.py (1): Migration (6-282)
🪛 Ruff (0.12.2)
usage_costs/migrations/0035_alter_modelpricing_model_name_and_more.py
8-13: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
15-320: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test (3.10.12, 1.8.3)
🔇 Additional comments (5)
usage_costs/models.py (1)
53-69: Add PublicAI provider enum — looks correct
Value 14 is unique and matches the migration and pricing script usage.
scripts/init_llm_pricing.py (2)
16-18: Good move: notes migrated from comment to the notes field
Storing this in the DB instead of an inline comment improves traceability.
20-30: Verify PublicAI pricing & update entry
- PublicAI lists swiss‑ai/apertus‑70b‑instruct and notes it’s free during “Swiss AI Weeks” (September 2025); no per‑token pricing published as of 2025-09-11.
- Action: do not keep the hardcoded unit_cost_input/unit_cost_output (0.25/2 per 1M) unless you can cite published pricing — either remove or mark them as estimates with an “as of 2025-09-11” note; update pricing_url to a real pricing/docs page (current value points to API endpoints). Confirm with the provider and update the entry.
File: scripts/init_llm_pricing.py Lines: 20-30
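For illustration, one hedged way the entry could carry that caveat until pricing is published (field names mirror those cited in this review; the actual helper and kwargs in scripts/init_llm_pricing.py may differ):

```python
# Hypothetical shape: keep the numbers only if explicitly labeled as estimates
apertus_pricing = dict(
    model_name="apertus_70b_instruct",
    unit_cost_input=0.25,  # per 1M tokens; estimate as of 2025-09-11, not published
    unit_cost_output=2,    # per 1M tokens; estimate as of 2025-09-11, not published
    pricing_url="",        # TODO: link a real PublicAI pricing/docs page, not the API endpoint
    notes="Estimated pricing as of 2025-09-11; free during Swiss AI Weeks (Sep 2025).",
)
```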
usage_costs/migrations/0035_alter_modelpricing_model_name_and_more.py (2)
20-26: Apertus model added to choices — consistent with model enum
The "apertus_70b_instruct" display label matches the enum label; migration looks good.
300-315: Provider choices include PublicAI (14) — aligned with models.py
The new provider id matches the enum; no data-mapping concerns.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
daras_ai_v2/text_splitter.py (1)
58-63: Per-thread cache bug: encoder cached for first model only
threadlocal.enc stores a single encoder; subsequent calls with a different model reuse the wrong encoder, producing incorrect lengths and splits.
Apply per-model caching with a safe fallback:
```diff
-    try:
-        enc = threadlocal.enc
-    except AttributeError:
-        enc = tiktoken.encoding_for_model(model)
-        threadlocal.enc = enc
+    try:
+        encoders = threadlocal.encoders
+    except AttributeError:
+        encoders = threadlocal.encoders = {}
+    enc = encoders.get(model)
+    if enc is None:
+        try:
+            enc = tiktoken.encoding_for_model(model)
+        except Exception:
+            # Fallbacks for older tiktoken or unknown model IDs
+            try:
+                enc = tiktoken.get_encoding("o200k_base")
+            except Exception:
+                enc = tiktoken.get_encoding("cl100k_base")
+        encoders[model] = enc
```
🧹 Nitpick comments (1)
daras_ai_v2/text_splitter.py (1)
44-57: Optional: hoist prefixes to a module constant
Slight readability/perf win; avoids rebuilding the list on every call and centralizes future additions.
Example:
```diff
+MODEL_PREFIXES_MAP_TO_GPT4O = (
+    "o3-", "gpt-4.5-", "gpt-4.1-", "sarvam-", "claude-", "google/", "aisingapore/", "swiss-ai/",
+)
 def default_length_function(text: str, model: str = "gpt-4") -> int:
-    if any(
-        model.startswith(prefix)
-        for prefix in [
-            "o3-",
-            "gpt-4.5-",
-            "gpt-4.1-",
-            "sarvam-",
-            "claude-",
-            "google/",
-            "aisingapore/",
-            "swiss-ai/",
-        ]
-    ):
+    if any(model.startswith(prefix) for prefix in MODEL_PREFIXES_MAP_TO_GPT4O):
         model = "gpt-4o"
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- daras_ai_v2/text_splitter.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test (3.10.12, 1.8.3)
🔇 Additional comments (1)
daras_ai_v2/text_splitter.py (1)
44-57: Verify tokenizer parity for 'swiss-ai/' → 'gpt-4o' mapping
Local tiktoken check failed (ModuleNotFoundError). Run the provided tiktoken snippet locally (pip install tiktoken) to compare o200k_base vs cl100k_base on representative text and confirm whether SwissAI/Apertus-70B uses a Llama-style tokenizer; if it does, map 'swiss-ai/' to cl100k_base or a provider-specific fallback instead of gpt-4o. File: daras_ai_v2/text_splitter.py:44-57.
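A minimal version of that check (the sample text is a placeholder; swap in representative multilingual content from the app):

```python
import tiktoken

# Placeholder text; replace with real document samples before drawing conclusions
sample = "Apertus ist ein mehrsprachiges Schweizer Sprachmodell. " * 50

for name in ("o200k_base", "cl100k_base"):
    enc = tiktoken.get_encoding(name)
    print(f"{name}: {len(enc.encode(sample))} tokens")

# If both counts diverge significantly from Apertus's own (Llama-style) tokenizer,
# map the "swiss-ai/" prefix to cl100k_base or a provider-specific fallback instead of gpt-4o.
```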
Q/A checklist
How to check import time?
You can visualize this using tuna:
To measure import time for a specific library:
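For example, one simple stdlib-only way to time a single library's import (a sketch; the library name is a placeholder, and the original snippet from this checklist is not reproduced here):

```python
import importlib
import sys
import time


def measure_import_time(module_name: str) -> float:
    # Drop any cached copies so we measure a cold import
    for name in list(sys.modules):
        if name == module_name or name.startswith(module_name + "."):
            del sys.modules[name]
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


print(f"pandas: {measure_import_time('pandas'):.3f}s")  # placeholder library
```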
To reduce import times, import libraries that take a long time inside the functions that use them instead of at the top of the file:
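For example (a sketch; the function and library are placeholders):

```python
# Before: paid on every process start, even when the feature is never used
# import pandas as pd


def export_report(rows: list[dict]) -> str:
    # After: deferred import; the cost is only paid the first time this runs
    import pandas as pd

    return pd.DataFrame(rows).to_csv(index=False)
```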
Legal Boilerplate
Look, I get it. The entity doing business as “Gooey.AI” and/or “Dara.network” was incorporated in the State of Delaware in 2020 as Dara Network Inc. and is gonna need some rights from me in order to utilize my contributions in this PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Dara Network Inc can use, modify, copy, and redistribute my contributions, under its choice of terms.