When server_combinations is missing from config.yaml, the models settings are ignored, defaulting to openai and gpt-4o #19

@ct-parker

With the following metacoder configuration, the models settings are completely ignored and the run falls back to openai and gpt-4o. The cause is the omitted server_combinations parameter, which is unexpected behavior.

Secondly, as with Issue #18, if the OpenAI calls fail for any reason, the code gets caught in a try/fail/repeat loop. This should be documented as a separate bug; it likely has a similar solution to the one in the cited issue.

Expected behavior could be one (or more) of the following:

  1. Produce an error message like: "No server_combinations configuration specified in metacoder input file 'tests/input/literature_mcp_eval_config.yaml'. This is a required setting. Please read the documentation for how to create this file."
  2. Produce the above error message and also include a script that can independently validate the YAML file and report any errors.
  3. Use all provided information to establish reasonable default parameters (e.g., the models configuration) instead of assuming openai and gpt-4o.

name: pubmed tools evals
description: |
  Evaluations for multiple pubmed MCPs

coders:
  goose: {}

models:
  claude-sonnet:
    provider: anthropic
    name: claude-sonnet-4-20250514

servers:
  mcp-simple-pubmed:
    name: pubmed
    command: uvx
    args: [mcp-simple-pubmed]
    env:
      PUBMED_EMAIL: ctparker@lbl.gov

# When server_combinations is missing, the models parameter is ignored, defaulting to openai + gpt-4o
#server_combinations:
#  - [mcp-simple-pubmed]

cases:
- name: PMID_28027860_Full_Text
  metrics: [CorrectnessMetric]
  input: "What is the first sentence of section 2 in PMID: 28027860?"
  expected_output: |
    Even though many of NFLE's core features have been clarified in the last two decades, some critical issues remain controversial.
  threshold: 0.9
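
Options 1 and 2 above could be covered by a small standalone check that runs before any evaluation starts. The following is only a sketch, not metacoder's actual code: `validate_config` is a hypothetical helper, it works on an already-parsed configuration dict, and it assumes that server_combinations is meant to be required, as Option 1 suggests.

```python
from typing import Any, Dict, List


def validate_config(config: Dict[str, Any], path: str) -> List[str]:
    """Return human-readable problems found in a parsed metacoder config.

    Hypothetical helper for illustration; the rules below are assumptions
    drawn from this issue, not from metacoder's documented schema.
    """
    errors: List[str] = []

    servers = config.get("servers") or {}
    combos = config.get("server_combinations")
    if combos is None:
        # Assumption: a missing server_combinations should be a hard error.
        errors.append(
            "No server_combinations configuration specified in metacoder "
            f"input file '{path}'. This is a required setting."
        )
    else:
        # Every server named in a combination must be defined under servers.
        for combo in combos:
            for server in combo:
                if server not in servers:
                    errors.append(
                        f"server_combinations references unknown server "
                        f"'{server}' in '{path}'."
                    )

    # Each model entry needs both a provider and a name.
    for key, model in (config.get("models") or {}).items():
        for field in ("provider", "name"):
            if not (model or {}).get(field):
                errors.append(f"Model '{key}' is missing '{field}' in '{path}'.")

    return errors


# Mirrors the configuration in this issue, with server_combinations omitted.
config = {
    "coders": {"goose": {}},
    "models": {
        "claude-sonnet": {
            "provider": "anthropic",
            "name": "claude-sonnet-4-20250514",
        }
    },
    "servers": {
        "mcp-simple-pubmed": {
            "name": "pubmed",
            "command": "uvx",
            "args": ["mcp-simple-pubmed"],
        }
    },
}

problems = validate_config(config, "tests/input/literature_mcp_eval_config.yaml")
for problem in problems:
    print(problem)
```

To check a real file, the dict could be loaded with PyYAML's `yaml.safe_load` and passed to the same function; a non-empty result would then abort the run with the collected messages.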

The above metacoder configuration results in the following config.yaml in eval_workdir/claude-sonnet_goose_PMID_28027860_Full_Text_no_servers/claude-sonnet_goose_PMID_28027860_Full_Text/.config/goose/:

GOOSE_MODEL: gpt-4o
GOOSE_PROVIDER: openai
extensions:
  developer:
    bundled: true
    display_name: Developer
    enabled: true
    name: developer
    timeout: 300
    type: builtin

If Option 3 (see above) were implemented, this would look like:

GOOSE_MODEL: claude-sonnet-4-20250514
GOOSE_PROVIDER: anthropic
extensions:
  developer:
    bundled: true
    display_name: Developer
    enabled: true
    name: developer
    timeout: 300
    type: builtin
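
The fallback behind Option 3 could be sketched roughly as follows. This is an illustration under stated assumptions, not metacoder's implementation: `goose_model_settings` is a hypothetical function, and picking the first entry under models when none is selected is one possible policy. The openai and gpt-4o defaults are the ones observed in the generated config.yaml.

```python
from typing import Any, Dict, Optional

# Defaults observed in the generated Goose config when models is ignored.
DEFAULT_PROVIDER = "openai"
DEFAULT_MODEL = "gpt-4o"


def goose_model_settings(
    config: Dict[str, Any], model_key: Optional[str] = None
) -> Dict[str, str]:
    """Derive GOOSE_MODEL / GOOSE_PROVIDER from the models section.

    Hypothetical helper: the result does not depend on server_combinations,
    so a missing server list can no longer discard the model choice.
    """
    models = config.get("models") or {}
    if model_key is None and models:
        # Assumption: with no explicit selection, use the first listed model.
        model_key = next(iter(models))
    model = models.get(model_key) or {}
    return {
        "GOOSE_MODEL": model.get("name", DEFAULT_MODEL),
        "GOOSE_PROVIDER": model.get("provider", DEFAULT_PROVIDER),
    }


# The models section from this issue; server_combinations is absent.
config = {
    "models": {
        "claude-sonnet": {
            "provider": "anthropic",
            "name": "claude-sonnet-4-20250514",
        }
    }
}

settings = goose_model_settings(config)
print(settings)
```

With the configuration from this issue, the sketch yields the anthropic provider and claude-sonnet-4-20250514; the openai and gpt-4o pair is used only when no models are configured at all.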
