Skip to content

Conversation

matdev83
Copy link
Owner

@matdev83 matdev83 commented Oct 12, 2025

Summary

  • handle tuple-style tool arguments when extracting commands in the pytest full suite steering handler
  • add a regression test covering tuple arguments supplied at the top level of the tool call context

Testing

  • python -m pytest tests/unit/core/services/tool_call_handlers/test_pytest_full_suite_handler.py
  • python -m pytest (fails: tests/unit/test_config_persistence.py::test_save_and_load_persistent_config and tests/integration/test_versioned_api.py::test_versioned_endpoint_requires_authentication)

https://chatgpt.com/codex/tasks/task_e_68ec25355ce08333b87d0889ca85b9a5

Summary by CodeRabbit

  • New Features
    • Broadened support for command arguments by accepting tuples and other sequences, improving reliability when running the full test suite from varied input formats.
  • Bug Fixes
    • Fixed edge cases where non-list argument inputs weren’t recognized, ensuring consistent detection and execution of the test suite.
  • Tests
    • Added coverage for tuple-based root-level arguments to validate handler behavior and prevent regressions.

@coderabbitai
Copy link

coderabbitai bot commented Oct 12, 2025

Walkthrough

Expanded argument extraction in PytestFullSuiteHandler to treat any non-string/bytes Sequence as command components, not just lists. Updated _extract_command control flow accordingly. Added a unit test to validate root-level tuple arguments are detected and handled, asserting can_handle and should_swallow behaviors.

Changes

Cohort / File(s) Summary
Handler logic update
src/core/services/tool_call_handlers/pytest_full_suite_handler.py
Replaced list-only check with a general Sequence check (excluding str/bytes) in command extraction; imported Sequence; adjusted control flow in _extract_command to join any non-empty Sequence into a command string.
Unit tests
tests/unit/core/services/tool_call_handlers/test_pytest_full_suite_handler.py
Added test for tuple-based root-level tool_arguments: verifies can_handle returns True and handle marks should_swallow as True.

Sequence Diagram(s)

sequenceDiagram
  actor Caller
  participant Handler as PytestFullSuiteHandler
  participant Extract as _extract_command

  Caller->>Handler: can_handle(context)
  Handler->>Extract: parse tool_arguments
  alt arguments is non-empty Sequence and not str/bytes
    Extract-->>Handler: "pytest -q" (joined command)
    Handler-->>Caller: True
  else
    Extract-->>Handler: None/other
    Handler-->>Caller: False
  end

  Caller->>Handler: handle(context)
  Handler->>Extract: parse tool_arguments
  Extract-->>Handler: "pytest -q"
  Handler-->>Caller: should_swallow = True
Loading

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

I thump my paw—tuple treats today,
Sequenced steps hop into the fray.
From list to tuple, commands align,
"pytest -q"—so crisp, so fine.
Carrots up! The suite will run,
Another green hop—review done. 🥕🐇

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title clearly and concisely summarizes the primary change by indicating that the pytest full suite handler now supports tuple arguments, which directly reflects the core update in the changeset.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch codex/fix-a-bug-in-reactor-tool-mnv67b

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 326702b and 3762ceb.

📒 Files selected for processing (2)
  • src/core/services/tool_call_handlers/pytest_full_suite_handler.py (2 hunks)
  • tests/unit/core/services/tool_call_handlers/test_pytest_full_suite_handler.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Avoid using emojis in Python code comments or docstrings
Follow PEP 8 and use type hints for all functions
Import order must be: standard library, third-party, then local imports, separated by blank lines
Naming conventions: snake_case for variables and functions; PascalCase for classes
Error handling: use specific exceptions and include meaningful error messages
Prefer f-strings for string formatting

Files:

  • tests/unit/core/services/tool_call_handlers/test_pytest_full_suite_handler.py
  • src/core/services/tool_call_handlers/pytest_full_suite_handler.py
src/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.py: Catch specific exceptions; avoid broad except Exception blocks
If a broad exception must be caught, log with exc_info=True and re-raise a specific custom exception
Use the most specific exception class from src.core.common.exceptions that accurately describes the error
Provide clear, helpful error messages and include relevant details in the details dictionary of custom exceptions

Files:

  • src/core/services/tool_call_handlers/pytest_full_suite_handler.py
🧬 Code graph analysis (1)
tests/unit/core/services/tool_call_handlers/test_pytest_full_suite_handler.py (2)
src/core/services/tool_call_handlers/pytest_full_suite_handler.py (3)
  • PytestFullSuiteHandler (177-275)
  • can_handle (194-207)
  • handle (209-242)
src/core/interfaces/tool_call_reactor_interface.py (1)
  • ToolCallContext (18-43)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: test (3.10)
  • GitHub Check: Analyze (python)
🔇 Additional comments (1)
src/core/services/tool_call_handlers/pytest_full_suite_handler.py (1)

16-16: LGTM! Import follows PEP 8 ordering.

The Sequence import from collections.abc is correctly positioned in the standard library import section.

Comment on lines +85 to 90
if (
isinstance(arguments, Sequence)
and not isinstance(arguments, str | bytes)
and arguments
):
return " ".join(str(item) for item in arguments)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

🧩 Analysis chain

Implementation correctly handles generic sequences.

The change from checking only list to checking any Sequence (excluding str and bytes) is well-implemented. The exclusion of str and bytes is necessary because they're also sequences in Python but shouldn't be split into command components.

However, verify the type contract: ToolCallContext.tool_arguments is typed as dict[str, Any] in the interface (see src/core/interfaces/tool_call_reactor_interface.py lines 17-42), but this code path handles sequences at the root level. This suggests either:

  1. The implementation is being defensive against runtime type contract violations from different LLM providers
  2. The type annotation in ToolCallContext may need updating to dict[str, Any] | Sequence[Any]

Run this script to check if tool_arguments is passed as a non-dict elsewhere in the codebase:


🏁 Script executed:

#!/bin/bash
# Search for ToolCallContext instantiations where tool_arguments might not be a dict

# Look for ToolCallContext construction with tool_arguments parameter
rg -nP --type=py -C3 'ToolCallContext\s*\(' | rg -P 'tool_arguments\s*='

Length of output: 249



🏁 Script executed:

#!/bin/bash
# Display ToolCallContext definition
sed -n '1,200p' src/core/interfaces/tool_call_reactor_interface.py

Length of output: 5079


🏁 Script executed:

#!/bin/bash
# Show context around Sequence handling in pytest_full_suite_handler.py
sed -n '1,200p' src/core/services/tool_call_handlers/pytest_full_suite_handler.py

Length of output: 6419


🏁 Script executed:

#!/bin/bash
# Locate the _extract_pytest_command method
rg -n 'def _extract_pytest_command' -C5 src/core/services/tool_call_handlers/pytest_full_suite_handler.py

Length of output: 426


Update ToolCallContext.tool_arguments annotation to accept both mappings and sequences
Change its type in src/core/interfaces/tool_call_reactor_interface.py from

tool_arguments: dict[str, Any]

to

tool_arguments: dict[str, Any] | Sequence[Any]

so it matches the handler’s support for root-level sequences.

🤖 Prompt for AI Agents
In src/core/services/tool_call_handlers/pytest_full_suite_handler.py around
lines 85 to 90, the handler accepts root-level sequences for tool arguments but
the ToolCallContext.tool_arguments type is still declared as dict[str, Any];
update the annotation in src/core/interfaces/tool_call_reactor_interface.py to
accept either a mapping or a sequence by changing tool_arguments: dict[str, Any]
to tool_arguments: dict[str, Any] | Sequence[Any], and add the necessary import
for Sequence from typing (or collections.abc) to ensure the type resolves.

Comment on lines +146 to +162
@pytest.mark.asyncio
async def test_handler_detects_tuple_arguments_root_level() -> None:
handler = PytestFullSuiteHandler(enabled=True)
context = ToolCallContext(
session_id="session-tuple",
backend_name="backend",
model_name="model",
full_response={},
tool_name="bash",
tool_arguments=("pytest", "-q"),
)

assert await handler.can_handle(context) is True
result = await handler.handle(context)

assert result.should_swallow is True

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Type contract violation: tool_arguments should be dict[str, Any].

The test passes a tuple ("pytest", "-q") directly as tool_arguments, but ToolCallContext.tool_arguments is typed as dict[str, Any] (see src/core/interfaces/tool_call_reactor_interface.py line 38). This creates a type safety violation.

Consider one of these solutions:

  1. If tuples are intentionally supported: Update ToolCallContext.tool_arguments type annotation to dict[str, Any] | Sequence[Any] to reflect the actual supported types
  2. If this is just defensive programming: Keep the test but add a comment explaining that it tests edge cases from providers that violate the type contract
  3. If tuples shouldn't be supported at root level: Wrap the tuple in a dict like existing tests do: tool_arguments={"command": ("pytest", "-q")}

To verify the intended behavior, please clarify:

What is the expected type contract for ToolCallContext.tool_arguments in Python tool call handling frameworks? Should it strictly be a dictionary or support sequences at the root level?
🤖 Prompt for AI Agents
In tests/unit/core/services/tool_call_handlers/test_pytest_full_suite_handler.py
around lines 146-162 the test passes a tuple as tool_arguments which violates
the declared ToolCallContext.tool_arguments type (dict[str, Any]); fix the test
by wrapping the tuple inside a dictionary (use a key like "command") so
tool_arguments is a dict, and add a short comment explaining this mirrors the
expected contract; alternatively, if the framework should accept sequences at
the root level, update the ToolCallContext type annotation to include
Sequence[Any] and adjust any callers/tests accordingly—pick one approach and
make the corresponding change and comment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant