
feat(responses): Experimental Responses API compatibility #51

@alexx-ftw


Summary

Adds OpenAI Responses API–compatible endpoints behind flags, with local polyfills to work around ChatGPT endpoint constraints.

Endpoints

  • POST /v1/responses -- streams SSE when "stream": true; aggregates JSON when false.
  • GET /v1/responses/{id} -- available when a prior request used "store": true (local storage only).
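
For reference, a minimal request body for POST /v1/responses might look like this (a sketch, not prescribed by this PR; the model name is a placeholder):

```python
import json

# Minimal request body for POST /v1/responses. Field names follow the
# public Responses API shape; "gpt-5" is only a placeholder model name.
payload = {
    "model": "gpt-5",
    "input": "Hello!",
    "stream": True,   # SSE stream; set False for an aggregated JSON response
    "store": True,    # lets a later GET /v1/responses/{id} hit the local store
}
print(json.dumps(payload))
```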

Features

Input handling: Accepts native Responses input or converts Chat-style messages/prompt.
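
The Chat-style conversion could look roughly like this (a hypothetical sketch: `chat_to_responses_input` is not necessarily the PR's function name, and the item shape follows the public Responses API):

```python
def chat_to_responses_input(messages):
    """Convert Chat Completions-style messages into Responses-style input items."""
    items = []
    for msg in messages:
        content = msg.get("content", "")
        if isinstance(content, str):
            # Plain string content becomes a single input_text part.
            parts = [{"type": "input_text", "text": content}]
        else:
            # Assume the client already sent a list of content parts.
            parts = content
        items.append({
            "type": "message",
            "role": msg.get("role", "user"),
            "content": parts,
        })
    return items
```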

Tools: Supports function tools plus passthrough responses_tools for web_search/web_search_preview; honors responses_tool_choice (auto/none).

Reasoning: Configurable via --reasoning-effort, --reasoning-summary, --reasoning-compat.

Parameter safety:

  • Strips unsupported token params (max_output_tokens, max_completion_tokens) for the codex upstream.
  • Strips the client-supplied store and previous_response_id before forwarding, pinning store=false (the only value the upstream accepts).
  • Implements local polyfills for expected API behavior.
  • Sanitizes upstream response ID references (rs_*) to prevent invalid cross-call references.
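
Taken together, the stripping rules amount to something like the following (hypothetical helper names; the PR's actual functions may differ):

```python
# Params the ChatGPT codex upstream rejects outright.
UNSUPPORTED_PARAMS = {"max_output_tokens", "max_completion_tokens",
                      "store", "previous_response_id"}

def sanitize_payload(payload):
    """Drop rejected params, then pin the values the upstream requires."""
    clean = {k: v for k, v in payload.items() if k not in UNSUPPORTED_PARAMS}
    clean["store"] = False   # upstream 400s on any other value
    clean["stream"] = True   # upstream only supports streaming
    return clean

def strip_rs_ids(items):
    """Remove upstream rs_* item ids so they never leak into a later call."""
    out = []
    for item in items:
        item = dict(item)
        if isinstance(item.get("id"), str) and item["id"].startswith("rs_"):
            del item["id"]
        out.append(item)
    return out
```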

ChatGPT Endpoint Constraints (Validated via Live Testing)

This implementation proxies to ChatGPT's internal backend-api/codex/responses endpoint, which has strict limitations:

| Test Case | Result | Error Message |
| --- | --- | --- |
| store=false | ✅ Accepted (also requires stream=true) | (none) |
| store=true | ❌ 400 | "Store must be set to false" |
| store omitted | ❌ 400 | "Store must be set to false" |

Key findings:

  • The ChatGPT endpoint requires store=false and rejects any other value
  • previous_response_id is not supported by the upstream endpoint
  • Only streaming mode (stream=true) is supported
  • This differs from the official OpenAI Platform Responses API, which supports store=true for server-side persistence

Our approach: Implement local storage (_STORE) and threading (_THREADS) to provide expected API behavior while respecting upstream constraints.
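
A minimal sketch of that polyfill (hypothetical function names; only _STORE and _THREADS come from the summary above):

```python
import uuid

_STORE = {}    # response_id -> stored response object
_THREADS = {}  # response_id -> previous_response_id

def store_response(response, previous_response_id=None):
    """Persist a response locally so GET /v1/responses/{id} and
    previous_response_id threading work despite upstream store=false."""
    rid = response.get("id") or f"resp_{uuid.uuid4().hex}"
    response["id"] = rid
    _STORE[rid] = response
    if previous_response_id:
        _THREADS[rid] = previous_response_id
    return rid

def get_response(rid):
    """Serve GET /v1/responses/{id} from the local store."""
    return _STORE.get(rid)

def thread_history(rid):
    """Walk the local thread chain from newest to oldest."""
    chain = []
    while rid in _STORE:
        chain.append(_STORE[rid])
        rid = _THREADS.get(rid)
        if rid is None:
            break
    return chain
```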

CLI Flags

  • --enable-responses-api: mount /v1/responses endpoints (default off).
  • --responses-no-base-instructions: forward client instructions as-is; otherwise base instructions are injected and client instructions are moved into input.
  • --enable-web-search: when responses_tools is omitted, enables default web_search unless responses_tool_choice is "none".
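
The flag interactions above can be sketched as follows (hypothetical helper; the PR's actual code may be structured differently):

```python
def resolve_tools(request, enable_web_search=False):
    """Decide which tools to forward, mirroring the --enable-web-search
    semantics: inject a default web_search only when the client sent no
    responses_tools and did not opt out via responses_tool_choice."""
    tools = list(request.get("responses_tools") or [])
    choice = request.get("responses_tool_choice", "auto")
    if not tools and enable_web_search and choice != "none":
        tools.append({"type": "web_search"})
    return tools, choice
```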

Request for feedback

  • Feedback is welcome on the API shape and field mapping, instruction-handling defaults, tool passthrough, and downstream client compatibility (e.g., Jan/Raycast) before considering default-on.

Documentation

Comprehensive documentation added covering:

  • ChatGPT endpoint vs official OpenAI API differences
  • Parameter handling and local polyfill behavior
  • Upstream response ID sanitization logic
  • Logging events for debugging

PR: #52
