
fix: useChat status stays ready during stream resumption#999

Merged
threepointone merged 1 commit into main from fix/resume-stream-status
Feb 28, 2026

Conversation


@threepointone threepointone commented Feb 26, 2026

Problem

useChat status stayed "ready" during stream resumption after a page refresh — isLoading was false, so no abort button or thinking indicator appeared.

Four root causes:

1. addEventListener race

The transport registered its own addEventListener listener to detect CF_AGENT_STREAM_RESUMING, but onAgentMessage (also registered via addEventListener) always handled the message first. The fallback path ran — bypassing the AI SDK pipeline entirely. Chunks flowed through onAgentMessage → setMessages directly, so useChat never saw them and never set status to "streaming".
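A minimal, self-contained sketch of the ordering problem (listener names are hypothetical): EventTarget invokes listeners in registration order, so a detection listener registered after onAgentMessage can never intercept the message first.

```typescript
// EventTarget fires listeners in the order they were added, so the
// earlier-registered onAgentMessage handler always runs first.
const socket = new EventTarget();
const order: string[] = [];

// Registered first: the hook's message handler (fallback path).
socket.addEventListener("message", () => order.push("onAgentMessage"));
// Registered later: the transport's RESUMING detection — always second.
socket.addEventListener("message", () => order.push("transportListener"));

socket.dispatchEvent(new Event("message"));
console.log(order); // → ["onAgentMessage", "transportListener"]
```

Because the fallback handler wins this race every time, no amount of listener-side logic fixes it — hence the switch to a synchronous method call below.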

2. Transport instance instability

useMemo created new transport instances across renders and Strict Mode cycles. When _pk changed (async queries, socket recreation), the resolver was stranded on the old transport while onAgentMessage called handleStreamResuming on the new one — it never found the resolver.

3. Chat recreation on _pk change

Using agent._pk as the useChat id caused the AI SDK to recreate the Chat when the socket changed (shouldRecreateChat: chatRef.current.id !== options.id). This abandoned the in-flight makeRequest (including resume). The resume effect wouldn't re-fire because its deps are [resume, chatRef] — chatRef is the same ref object, so the effect never re-runs.

4. Double STREAM_RESUMING

The server sends CF_AGENT_STREAM_RESUMING from both onConnect (line 348) and the RESUME_REQUEST handler (line 551). Without deduplication, the second message triggered a duplicate ACK and double chunk replay.
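The dedup requirement can be sketched with a Set keyed by request id (all names hypothetical — the actual guard uses localRequestIdsRef, described under "Fix"):

```typescript
// Hypothetical sketch: the second CF_AGENT_STREAM_RESUMING for the same
// request id must be a no-op, otherwise ACK and replay run twice.
const seenResumingIds = new Set<string>();
let replays = 0;

function onStreamResuming(requestId: string): void {
  if (seenResumingIds.has(requestId)) return; // duplicate: ignore
  seenResumingIds.add(requestId);
  replays++; // ACK + chunk replay would happen here
}

onStreamResuming("req-1"); // from onConnect
onStreamResuming("req-1"); // from the RESUME_REQUEST handler — deduped
console.log(replays); // → 1
```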

Fix

addEventListener race → synchronous callback

Replaced the transport's addEventListener-based detection with handleStreamResuming() — a public method that onAgentMessage calls directly:

Server → WebSocket → PartySocket → onAgentMessage
                                        ↓
                              customTransport.handleStreamResuming()
                                        ↓ (true)
                              _resumeResolver → ACK + ReadableStream → useChat pipeline
                                        ↓ (false)
                              localRequestIdsRef check → fallback (cross-tab)
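The synchronous hand-off can be sketched as follows (a simplified model, not the actual transport class — reconnectToStream here only shows the resolver lifecycle, not the ACK or ReadableStream plumbing):

```typescript
// Sketch: onAgentMessage calls handleStreamResuming() directly; it returns
// true only when a resolver registered by reconnectToStream is waiting.
type Resolver = (msg: unknown) => void;

class TransportSketch {
  private resumeResolver: Resolver | null = null;

  reconnectToStream(): Promise<unknown> {
    return new Promise((resolve) => {
      this.resumeResolver = resolve; // registered before RESUMING arrives
    });
  }

  handleStreamResuming(msg: unknown): boolean {
    if (!this.resumeResolver) return false; // caller falls back (cross-tab)
    this.resumeResolver(msg); // ACK + ReadableStream proceed from here
    this.resumeResolver = null; // one-shot: consumed
    return true;
  }
}

const transport = new TransportSketch();
const pending = transport.reconnectToStream(); // resolver registered
transport.handleStreamResuming({ id: "req-1" }); // → true, resolves `pending`
void pending;
```

Returning a boolean lets onAgentMessage decide synchronously, in the same tick, whether to take the fallback path — there is no second listener to race against.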

Transport instability → true singleton

The transport is now a true singleton (useRef, created once, never recreated). transport.agent is updated every render to point at the latest socket. The resolver survives _pk changes because the transport instance never changes — both reconnectToStream (via ChatStore) and handleStreamResuming (via onAgentMessage) always operate on the same instance.

The agent property was changed from private to public, and reconnectToStream's resolver uses this.agent (not a captured local) so ACK and chunk listeners go through the latest socket.
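Why reading this.agent at send time matters can be shown with a toy socket type (names are assumptions, not the real PartySocket API):

```typescript
// Sketch: the resolver lives on the one transport instance; only the
// agent pointer is swapped, so sends always reach the latest socket.
type Socket = { sent: string[] };

class SingletonTransport {
  constructor(public agent: Socket) {}

  sendAck(): void {
    // Reads this.agent at call time — never a captured local from an
    // earlier render, which would still point at the old socket.
    this.agent.sent.push("CF_AGENT_STREAM_ACK");
  }
}

const oldSocket: Socket = { sent: [] };
const newSocket: Socket = { sent: [] };
const transport = new SingletonTransport(oldSocket);

transport.agent = newSocket; // _pk changed: re-point, don't recreate
transport.sendAck();
console.log(oldSocket.sent.length, newSocket.sent.length); // → 0 1
```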

Chat recreation → stable ID

Replaced id: agent._pk with id: initialMessagesCacheKey (based on URL + agent namespace + instance name). This identifier is stable across socket recreations, so the AI SDK's Chat is never recreated when _pk changes. The in-flight makeRequest survives and correctly transitions status to "streaming".
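The recreation check reduces to an id comparison, so the fix is simply choosing an id built only from values that survive socket recreation (the key format below is illustrative):

```typescript
// Mirrors the AI SDK check cited above: Chat is recreated iff the id changed.
const shouldRecreateChat = (currentId: string, nextId: string): boolean =>
  currentId !== nextId;

// _pk-based id differs across socket recreations → Chat recreated,
// in-flight makeRequest abandoned:
const recreatedWithPk = shouldRecreateChat("pk-socket-1", "pk-socket-2"); // true

// cache-key id (URL + namespace + instance) is socket-independent:
const cacheKey = "https://example.com:my-namespace:my-instance";
const recreatedWithKey = shouldRecreateChat(cacheKey, cacheKey); // false
```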

Double STREAM_RESUMING → localRequestIdsRef guard

Added localRequestIdsRef.current.has(data.id) check before the fallback path to prevent duplicate ACK/replay.

Edge cases handled

  • Double STREAM_RESUMING — localRequestIdsRef guard prevents duplicate ACK/replay
  • React strict mode — useRef singleton survives mount/unmount/remount; second reconnectToStream overwrites resolver; first times out harmlessly
  • _pk change (async queries) — transport singleton survives; agent ref updated; ACK/listeners use latest socket; stable Chat ID prevents Chat recreation
  • WS not open on refresh — PartySocket queues send(); works regardless of readyState
  • Connection drops mid-resume — PartySocket EventTarget survives reconnection
  • Timeout (no active stream) — resolves null after 5s, resolver cleared
  • Agent switches — transport's agent ref updated every render; old resolver orphaned

Testing

  • 23 transport unit tests (18 original + 5 new agent-swap tests) covering: resolver lifecycle, ACK, activeRequestIds tracking, timeout, strict mode double-call, chunk reception/completion/filtering, error handling, send failure tolerance, double-RESUMING dedup, agent property update, resolver surviving agent swap, ACK routing to new agent, chunk listener on new agent, full end-to-end agent swap
  • All 290 tests pass (23 new + 267 existing)
  • All 45 projects typecheck


changeset-bot bot commented Feb 26, 2026

🦋 Changeset detected

Latest commit: ff7026b

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package:

  • @cloudflare/ai-chat — Patch



pkg-pr-new bot commented Feb 26, 2026


npm i https://pkg.pr.new/agents@999
npm i https://pkg.pr.new/@cloudflare/ai-chat@999
npm i https://pkg.pr.new/@cloudflare/codemode@999
npm i https://pkg.pr.new/hono-agents@999

commit: 807365f

Implement WebSocketChatTransport.reconnectToStream() to return a proper
ReadableStream for resumed streams, and forward the resume option to
useChat. This lets the AI SDK's pipeline process resumed chunks natively,
correctly managing status, isLoading, and abort during stream resumption.

- reconnectToStream sends RESUME_REQUEST, waits for RESUMING, sends ACK,
  returns ReadableStream fed by replayed + live chunks
- 100ms delayed explicit request avoids double-RESUMING race with onConnect
- onAgentMessage guards with localRequestIdsRef to skip transport-handled
  chunks
- Removed duplicate RESUME_REQUEST from useEffect (transport owns it now)
- Updated test to verify progressive chunk processing
@threepointone threepointone force-pushed the fix/resume-stream-status branch from 4b4d4b0 to ff7026b on February 28, 2026 15:30
@threepointone threepointone merged commit 95753da into main Feb 28, 2026
3 checks passed
@threepointone threepointone deleted the fix/resume-stream-status branch February 28, 2026 21:39
@github-actions github-actions bot mentioned this pull request Feb 28, 2026
