Description
First: thanks for this amazing set of hooks and components. It has made implementing all of this so much clearer and simpler.
As I've experimented with this library I've noticed that the Transcriber only shows a single message (because of the `slice(1)`). After I removed that, the transcription worked pretty well. Unfortunately, I started to see sections of the user transcript cut off (or missing entirely). The LLM would respond to the text as if it existed; it just wasn't displayed.
I tracked this down to a race condition in `handleDataChannelMessage` in `useWebRTCAudioSession`. I think what is happening is that the message updates arrive on an interval too fast for React's state updates, and messages are getting lost. This happens most frequently when the user pauses and the assistant attempts to interrupt. I suspect this block:
```ts
setConversation((prev) => {
  const lastMsg = prev[prev.length - 1];
  if (lastMsg && lastMsg.role === "assistant" && !lastMsg.isFinal) {
    // Append to the existing assistant partial
    const updated = [...prev];
    updated[updated.length - 1] = {
      ...lastMsg,
      text: lastMsg.text + msg.delta,
    };
    return updated;
  } else {
    // Start a new assistant partial
    return [...prev, newMessage];
  }
});
```

For now I have worked around this by keeping a second transcript variable. I set this up as a ref:
```ts
// Transcript state
const transcript = useRef<string>("");
```
Then in `conversation.item.input_audio_transcription.completed` and `response.audio_transcript.done` I just do:
```ts
const speaker = "You"; // or "AI"
const text = msg.transcript.trim();
transcript.current += `\n${speaker}: ${text}`;
```

Using a ref seems to work around the problem well. Similarly, if I make a messages ref, I see all of the messages.
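For reference, here's a minimal sketch of the messages-ref variant. The `Message` shape and the handler name are assumptions on my part, not the library's actual types:

```ts
// Assumed shape; the library's actual conversation type may differ.
interface Message {
  id: string;
  role: "user" | "assistant";
  text: string;
  isFinal: boolean;
}

// Inside the hook: mirror every message into a ref so rapid
// data-channel events can't be dropped between React renders.
const messagesRef = useRef<Message[]>([]);

function recordDelta(id: string, role: Message["role"], delta: string) {
  const last = messagesRef.current[messagesRef.current.length - 1];
  if (last && last.id === id && !last.isFinal) {
    last.text += delta; // mutate the ref directly, so nothing races a render
  } else {
    messagesRef.current.push({ id, role, text: delta, isFinal: false });
  }
}
```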
I suspect that I need to do more here... for example, I think I need to actually maintain an ordered hash of messages (by id) to prevent duplicate sends.
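Roughly what I have in mind is something like the following (again just a sketch, reusing the assumed `Message` shape above; the `id` would come from the event's item id):

```ts
// A Map preserves insertion order, so deriving the conversation array
// from it keeps messages in arrival order, while a re-delivered event
// overwrites its existing entry instead of creating a duplicate.
const byId = useRef<Map<string, Message>>(new Map());

function upsertMessage(id: string, role: Message["role"], delta: string, isFinal: boolean) {
  const existing = byId.current.get(id);
  byId.current.set(id, {
    id,
    role,
    text: (existing?.text ?? "") + delta,
    isFinal,
  });
  // Derive the ordered array once per event for rendering.
  setConversation(Array.from(byId.current.values()));
}
```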
I am curious if you would be interested in a PR?