
Conversation

nikochiko
Member

@nikochiko nikochiko commented Sep 11, 2025

Q/A checklist

  • I have tested my UI changes on mobile and they look acceptable
  • I have tested changes to the workflows in both the API and the UI
  • I have done a code review of my changes and looked at each line of the diff + the references of each function I have changed
  • My changes have not increased the import time of the server
How to check import time?

time python -c 'import server'

You can visualize this using tuna:

python3 -X importtime -c 'import server' 2> out.log && tuna out.log

To measure import time for a specific library:

$ time python -c 'import pandas'

________________________________________________________
Executed in    1.15 secs    fish           external
   usr time    2.22 secs   86.00 micros    2.22 secs
   sys time    0.72 secs  613.00 micros    0.72 secs

To reduce import time, move slow-to-import libraries inside the functions that use them instead of importing them at the top of the file:

def my_function():
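    # Lazy import: pandas is loaded on the first call to my_function,
    # not at server startup, so `import server` stays fast.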
    import pandas as pd
    ...

Legal Boilerplate

Look, I get it. The entity doing business as “Gooey.AI” and/or “Dara.network” was incorporated in the State of Delaware in 2020 as Dara Network Inc. and is gonna need some rights from me in order to utilize my contributions in this PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Dara Network Inc can use, modify, copy, and redistribute my contributions, under its choice of terms.

coderabbitai bot commented Sep 11, 2025

📝 Walkthrough

Walkthrough

  • Added LargeLanguageModels.apertus_70b_instruct as an LLMSpec for "swiss-ai/apertus-70b-instruct" (LLMApis.openai) with context_window=65_536 and max_output_tokens=4_096.
  • In run_openai_chat, disabled tool calling (tools = None) for the apertus_70b_instruct model.
  • Extended get_openai_client to recognize Swiss AI models (model IDs starting with "swiss-ai/") and construct an OpenAI-compatible client pointing at PublicAI using settings.PUBLICAI_API_KEY, base_url "https://api.publicai.co/v1", max_retries=0, and a custom User-Agent (see the sketch after this list).
  • Added PUBLICAI_API_KEY configuration in settings.py (default empty string).
  • Added pricing entry for "swiss-ai/apertus-70b-instruct" and introduced ModelProvider.publicai enum member; included a Django migration adjusting ModelPricing choices.
  • Added "swiss-ai/" prefix to text length mapping in text_splitter.default_length_function.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • devxpy

Pre-merge checks (2 passed, 1 warning)

❌ Failed checks (1 warning)
  • Docstring Coverage (⚠️ Warning): Docstring coverage is 0.00%, which is insufficient; the required threshold is 80.00%. Resolution: run @coderabbitai generate docstrings to improve docstring coverage.

✅ Passed checks (2 passed)
  • Title Check (✅ Passed): The title "feat: add support for SwissAI Apertus LLM" is concise, follows a conventional prefix, and accurately summarizes the primary change: integration of SwissAI Apertus across model specs, settings (PUBLICAI_API_KEY), pricing, the provider enum, and text-splitting behavior. A reviewer scanning history will understand the main intent.
  • Description Check (✅ Passed): The pull request description matches the repository template: it includes the Q/A checklist (all items checked), the detailed "How to check import time" section with commands, and the legal boilerplate, so it contains the required sections and sufficient information for reviewers.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (4)
daras_ai_v2/settings.py (1)

514-514: PUBLICAI_API_KEY added — please document and validate missing-key behavior

Looks good. Please add PUBLICAI_API_KEY to your env samples/ops docs and confirm that selecting a swiss-ai/* model without this key surfaces a clear error to the user (not just a 401 from provider).

I can open a docs patch and add a startup check that warns if PUBLICAI_API_KEY is empty.
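
For illustration, the startup check could be as small as this (a sketch assuming it sits at the bottom of settings.py, after PUBLICAI_API_KEY is defined; the logger setup is an assumption):

import logging

logger = logging.getLogger(__name__)

if not PUBLICAI_API_KEY:
    logger.warning(
        "PUBLICAI_API_KEY is empty; selecting a swiss-ai/* model will fail "
        "with a 401 from PublicAI."
    )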

daras_ai_v2/language_model.py (3)

83-91: New model spec: verify real token limits and JSON/tool support

Spec wiring looks correct. Per your past guidance, please confirm context_window=65,536 and max_output_tokens=4,096 from actual API behavior (not docs), and whether JSON mode is supported. If JSON is supported via PublicAI, set supports_json=True to enable response_format.

I can add an integration check that probes max tokens and JSON mode and auto-adjusts the spec.
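
For example, a one-off probe of JSON mode and the output-token ceiling might look like this (standalone script; the env-var name is an assumption, while the model ID and base_url come from this PR):

import os
import openai

client = openai.OpenAI(
    api_key=os.environ["PUBLICAI_API_KEY"],  # assumes the key is exported locally
    base_url="https://api.publicai.co/v1",
)
resp = client.chat.completions.create(
    model="swiss-ai/apertus-70b-instruct",
    messages=[{"role": "user", "content": 'Reply with JSON: {"ok": true}'}],
    response_format={"type": "json_object"},  # provider should 4xx if JSON mode is unsupported
    max_tokens=4_096,  # exercises the max_output_tokens claim at the same time
)
print(resp.choices[0].message.content)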


1656-1659: Disable tools with a visible warning (don’t fail silently)

Silently nulling tools can confuse callers. Log a warning when tools are provided, so it’s obvious they were ignored for this model.

Apply within this hunk:

 if model == LargeLanguageModels.apertus_70b_instruct:
-    # Swiss AI Apertus model doesn't support tool calling
-    tools = None
+    # Swiss AI Apertus model doesn't support tool calling
+    if tools:
+        logger.warning("Tools are not supported for %s; disabling tool calls.", model.name)
+    tools = None

1934-1941: PublicAI routing: double-check base_url and feature parity (streaming, usage, penalties)

Client wiring looks fine. Please verify:

  • base_url "https://api.publicai.co/v1" is correct for OpenAI-compatible chat completions (and supports streaming).
  • Usage tokens are returned in OpenAI shape (so record_openai_llm_usage works).
  • Frequency/presence penalties are accepted or safely ignored.

I can add a small probe script to hit chat.completions with stream=True and confirm response headers/usage.
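
A sketch of that probe (standalone; the env-var name is an assumption, and stream_options={"include_usage": True} is the standard OpenAI way to request usage on streams, so whether PublicAI honors it is exactly what this checks):

import os
import openai

client = openai.OpenAI(
    api_key=os.environ["PUBLICAI_API_KEY"],  # assumes the key is exported locally
    base_url="https://api.publicai.co/v1",
)
stream = client.chat.completions.create(
    model="swiss-ai/apertus-70b-instruct",
    messages=[{"role": "user", "content": "Say hi"}],
    stream=True,
    stream_options={"include_usage": True},
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
    if chunk.usage:  # OpenAI-shaped usage arrives on the final chunk
        print("\nusage:", chunk.usage)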

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e3b9171 and eec933a.

📒 Files selected for processing (2)
  • daras_ai_v2/language_model.py (3 hunks)
  • daras_ai_v2/settings.py (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: nikochiko
PR: GooeyAI/gooey-server#768
File: daras_ai_v2/language_model.py:124-126
Timestamp: 2025-08-12T08:22:19.003Z
Learning: When setting token limits for language models in daras_ai_v2/language_model.py, prioritize actual API behavior and platform-specific documentation over general model documentation to ensure consistent UX and avoid API errors.
Learnt from: nikochiko
PR: GooeyAI/gooey-server#768
File: daras_ai_v2/language_model.py:124-126
Timestamp: 2025-08-12T08:22:19.003Z
Learning: When determining language model token limits in daras_ai_v2/language_model.py, prioritize actual API behavior and direct testing over general documentation, as different model variants within the same family may have different practical limits.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: test (3.10.12, 1.8.3)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
usage_costs/models.py (1)

66-69: Optional: keep IntegerChoices ordered for readability

Placing aks = 5 after publicai = 14 is a bit jarring when scanning. Consider grouping/sorting by id.

 class ModelProvider(models.IntegerChoices):
     openai = 1, "OpenAI"
     google = 2, "Google"
     together_ai = 3, "TogetherAI"
     azure_openai = 4, "Azure OpenAI"
+    aks = 5, "Azure Kubernetes Service"
     anthropic = 6, "Anthropic"
     groq = 7, "groq"
     fireworks = 8, "Fireworks AI"
     mistral = 9, "Mistral AI"
     sarvam = 10, "sarvam.ai"
     fal_ai = 11, "fal.ai"
     twilio = 12, "Twilio"
     sea_lion = 13, "sea-lion.ai"
     publicai = 14, "PublicAI"
-
-    aks = 5, "Azure Kubernetes Service"
usage_costs/migrations/0035_alter_modelpricing_model_name_and_more.py (1)

1-14: Ruff RUF012 on migrations: prefer per-file ignore

Auto-generated migrations trigger RUF012 (ClassVar) lint. Recommend ignoring migrations to avoid churn.

Add to your Ruff config:

+[tool.ruff.lint.per-file-ignores]
+"**/migrations/*.py" = ["RUF012"]

I can raise a tiny PR to add this to pyproject.toml if you want.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eec933a and 114784e.

📒 Files selected for processing (3)
  • scripts/init_llm_pricing.py (1 hunks)
  • usage_costs/migrations/0035_alter_modelpricing_model_name_and_more.py (1 hunks)
  • usage_costs/models.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
scripts/init_llm_pricing.py (2)
daras_ai_v2/language_model.py (1)
  • LargeLanguageModels (82-1025)
usage_costs/models.py (1)
  • ModelProvider (53-68)
usage_costs/migrations/0035_alter_modelpricing_model_name_and_more.py (2)
usage_costs/migrations/0033_alter_modelpricing_model_name_and_more.py (1)
  • Migration (6-297)
usage_costs/migrations/0032_alter_modelpricing_model_name_alter_modelpricing_sku.py (1)
  • Migration (6-282)
🪛 Ruff (0.12.2)
usage_costs/migrations/0035_alter_modelpricing_model_name_and_more.py

8-13: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)


15-320: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: test (3.10.12, 1.8.3)
🔇 Additional comments (5)
usage_costs/models.py (1)

53-69: Add PublicAI provider enum — looks correct

Value 14 is unique and matches the migration and pricing script usage.

scripts/init_llm_pricing.py (2)

16-18: Good move: notes migrated from comment to notes field

Storing this in DB instead of inline comment improves traceability.


20-30: Verify PublicAI pricing & update entry

  • PublicAI lists swiss-ai/apertus-70b-instruct and notes it’s free during “Swiss AI Weeks” (September 2025); no per-token pricing published as of 2025-09-11.
  • Action: do not keep the hardcoded unit_cost_input/unit_cost_output (0.25/2 per 1M) unless you can cite published pricing; either remove them or mark them as estimates with an “as of 2025-09-11” note, and update pricing_url to point at a real pricing/docs page (the current value points at API endpoints). Confirm with the provider and update the entry.

File: scripts/init_llm_pricing.py Lines: 20-30

usage_costs/migrations/0035_alter_modelpricing_model_name_and_more.py (2)

20-26: Apertus model added to choices — consistent with model enum

"apertus_70b_instruct" display label matches the enum label; migration looks good.


300-315: Provider choices include PublicAI (14) — aligned with models.py

The new provider id matches the enum; no data-mapping concerns.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
daras_ai_v2/text_splitter.py (1)

58-63: Per-thread cache bug: encoder cached for first model only.

threadlocal.enc stores a single encoder; subsequent calls with a different model reuse the wrong encoder, producing incorrect lengths and splits.

Apply per-model caching with a safe fallback:

-    try:
-        enc = threadlocal.enc
-    except AttributeError:
-        enc = tiktoken.encoding_for_model(model)
-        threadlocal.enc = enc
+    try:
+        encoders = threadlocal.encoders
+    except AttributeError:
+        encoders = threadlocal.encoders = {}
+    enc = encoders.get(model)
+    if enc is None:
+        try:
+            enc = tiktoken.encoding_for_model(model)
+        except Exception:
+            # Fallbacks for older tiktoken or unknown model IDs
+            try:
+                enc = tiktoken.get_encoding("o200k_base")
+            except Exception:
+                enc = tiktoken.get_encoding("cl100k_base")
+        encoders[model] = enc
🧹 Nitpick comments (1)
daras_ai_v2/text_splitter.py (1)

44-57: Optional: hoist prefixes to a module constant.

Slight readability/perf win; avoids rebuilding the list on every call and centralizes future additions.

Example:

+MODEL_PREFIXES_MAP_TO_GPT4O = (
+    "o3-", "gpt-4.5-", "gpt-4.1-", "sarvam-", "claude-", "google/", "aisingapore/", "swiss-ai/",
+)
 def default_length_function(text: str, model: str = "gpt-4") -> int:
-    if any(
-        model.startswith(prefix)
-        for prefix in [
-            "o3-",
-            "gpt-4.5-",
-            "gpt-4.1-",
-            "sarvam-",
-            "claude-",
-            "google/",
-            "aisingapore/",
-            "swiss-ai/",
-        ]
-    ):
+    if any(model.startswith(prefix) for prefix in MODEL_PREFIXES_MAP_TO_GPT4O):
         model = "gpt-4o"
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 114784e and 823a279.

📒 Files selected for processing (1)
  • daras_ai_v2/text_splitter.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: test (3.10.12, 1.8.3)
🔇 Additional comments (1)
daras_ai_v2/text_splitter.py (1)

44-57: Verify tokenizer parity for 'swiss-ai/' → 'gpt-4o' mapping

Local tiktoken check failed (ModuleNotFoundError). Run the provided tiktoken snippet locally (pip install tiktoken) to compare o200k_base vs cl100k_base on representative text and confirm whether SwissAI/Apertus-70B uses a Llama-style tokenizer; if it does, map 'swiss-ai/' to cl100k_base or a provider-specific fallback instead of gpt-4o. File: daras_ai_v2/text_splitter.py:44-57.
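
The referenced snippet isn’t captured here, but the comparison it asks for takes only a few lines (pip install tiktoken first; the sample text is an arbitrary stand-in, so substitute text representative of your workloads):

import tiktoken

text = "Grüezi! Wie läuft's? A short multilingual sample for token counting."
for name in ("o200k_base", "cl100k_base"):
    enc = tiktoken.get_encoding(name)
    print(f"{name}: {len(enc.encode(text))} tokens")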

@nikochiko nikochiko merged commit 82333a4 into master Sep 15, 2025
8 checks passed
@nikochiko nikochiko deleted the apertus-llm branch September 15, 2025 08:33