[Bug]: send_media function is inconsistent #628

@jjmaldonis

Description

Summary

send_media behaves inconsistently, so I had to replace it with manual chunking plus send_audio.

What happened?

The code I'm running is below. I expected send_media to work reliably, but its behavior is inconsistent. I had to replace it with a much more complex approach that splits the file's audio into chunks and sends each chunk via send_audio. See the commented-out send_media call below and the loop that follows it.

async def transcribe_audio_flux(
    audio_file: str,
    sample_rate: int = 16000,
    chunk_size: int = 8192,
    wait_after_send: float = 2.0,
) -> Tuple[List[Segment]]:

    try:
        client = AsyncDeepgramClient(api_key=secret)
        collector = FluxCollector()    
        
        async with client.listen.v2.connect(
            model="flux-general-en",
            encoding="linear16",
            sample_rate=str(sample_rate),
            # eager_eot_threshold=0.6,
            # eot_threshold=0.6,
        ) as connection:
        
            def on_message(message: ListenV2SocketClientResponse) -> None:
                msg_type = getattr(message, "type", "Unknown")
                
                # Debug output
                # if msg_type == "Connected":
                #     print(f":white_check_mark: Deepgram Flux connected")
                
                # Collect turn information
                collector.on_message(message)
        
            connection.on(EventType.MESSAGE, on_message)
            connection.on(EventType.ERROR, lambda error: print(f"Error: {error}"))
        
            deepgram_task = asyncio.create_task(connection.start_listening())
    
            send_audio = getattr(connection, "send", None)
    
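            audio = AudioSegment.from_file(audio_file)
            audio_bytes = audio.raw_data
            # Note: this overwrites the sample_rate argument, which was already
            # passed to connect() above; it only affects the pacing delay below.
            sample_rate = audio.frame_rate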
            
            # await connection.send_media(audio_bytes)
            for index in range(0, len(audio_bytes), chunk_size):
                chunk = audio_bytes[index : index + chunk_size]
                if chunk:
                    if callable(send_audio):
                        await send_audio(chunk)
                    else:
                        await connection._send(chunk)
                    # Simulate real-time streaming delay
                    # (the factor of 2 assumes 16-bit mono PCM: 2 bytes per sample)
                    await asyncio.sleep(chunk_size / (sample_rate * 2))
    
            # Allow time for final transcripts to arrive
            await asyncio.sleep(wait_after_send)
    
            
            # Cancel the task
            if not deepgram_task.done():
                deepgram_task.cancel()
                try:
                    await deepgram_task
                except asyncio.CancelledError:
                    pass        
            
        segments = collector.finalize()
        return segments
    except Exception as e:
        return [str(e)]
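
For anyone reproducing this, the chunking and pacing arithmetic in the loop above can be isolated from the SDK. The following is only a minimal sketch under stated assumptions: `iter_chunks` and `chunk_duration_seconds` are hypothetical helper names (not part of the Deepgram SDK), and the audio is assumed to be raw PCM whose sample width and channel count are known. With the defaults (16-bit mono) the duration formula reduces to the `chunk_size / (sample_rate * 2)` expression used in the loop.

```python
from typing import Iterator


def iter_chunks(data: bytes, chunk_size: int) -> Iterator[bytes]:
    """Yield consecutive slices of `data`, each at most `chunk_size` bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(data), chunk_size):
        yield data[start : start + chunk_size]


def chunk_duration_seconds(
    num_bytes: int,
    sample_rate: int,
    sample_width: int = 2,  # bytes per sample; 2 means 16-bit PCM (assumption)
    channels: int = 1,      # mono (assumption)
) -> float:
    """Return the playback time, in seconds, of `num_bytes` of raw PCM."""
    return num_bytes / (sample_rate * sample_width * channels)


if __name__ == "__main__":
    # One second of silence at 16 kHz, 16-bit, mono.
    pcm = bytes(16000 * 2)
    chunks = list(iter_chunks(pcm, 8192))
    total = sum(chunk_duration_seconds(len(c), 16000) for c in chunks)
    print(len(chunks), round(total, 3))
```

If the file is stereo or not 16-bit, the delay computed with a fixed factor of 2 will not match real time, so passing the actual sample width and channel count keeps the pacing honest.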

Steps to reproduce

Run the code above

Minimal code sample

See above

Logs / traceback


Transport

WebSocket

API endpoint / path

flux

Model(s) used

flux

How often?

Often

Is this a regression?

  • Yes, it worked in an earlier version

Last working SDK version (if known)

No response

SDK version

?

Python version

?

Install method

None

OS

Linux (x86_64)

Environment details


Link to minimal repro (optional)

No response

Session ID (optional)

No response

Project ID (optional)

No response

Request ID (optional)

No response

Code of Conduct

  • I agree to follow this project’s Code of Conduct
