How to efficiently stream audio into whisper.cpp for real-time transcription? #3314

Answered by officiallyutso
d3r3k-d4nk asked this question in Q&A
Yeah, sure:

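#include "whisper.h"
#include <algorithm>
#include <vector>
#include <string>
#include <iostream>

// Split PCM samples into fixed-size chunks; consecutive chunks share overlap_samples samples
std::vector<std::vector<float>> split_audio(const std::vector<float>& pcm, int chunk_samples, int overlap_samples = 0) {
    std::vector<std::vector<float>> chunks;
    // The step must be positive, otherwise the loop below would never advance
    if (chunk_samples <= 0 || overlap_samples < 0 || overlap_samples >= chunk_samples) {
        return chunks;
    }
    const size_t chunk = static_cast<size_t>(chunk_samples);
    const size_t step = chunk - static_cast<size_t>(overlap_samples);

    for (size_t start = 0; start < pcm.size(); start += step) {
        const size_t end = std::min(start + chunk, pcm.size());
        chunks.emplace_back(pcm.begin() + start, pcm.begin() + end);
        if (end == pcm.size()) break;  // tail reached; later chunks would only repeat samples
    }

    return chunks;
}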

int main() {
    // Load model
    struct whisper_context* ctx = whisper_init_from_file("models/ggml-base.en.bin");

    

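For a live stream you usually don't have the whole `pcm` vector up front, so here is a rough sketch of the same chunk-with-overlap idea applied to samples that arrive in small blocks. `SlidingWindow` and `collect_windows` are hypothetical helpers I made up for illustration; they are not part of whisper.cpp. The assumption is that the callback is where you would hand each window to `whisper_full` (16 kHz mono float PCM); here it just collects the windows so the logic can be checked without a model.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical helper (not part of whisper.cpp): buffers incoming samples and
// emits fixed-size windows, keeping `overlap` samples between windows.
class SlidingWindow {
public:
    using Callback = std::function<void(const std::vector<float>&)>;

    SlidingWindow(std::size_t window, std::size_t overlap, Callback on_window)
        : window_(window), overlap_(overlap), on_window_(std::move(on_window)) {
        assert(overlap_ < window_);  // every emitted window must make progress
    }

    // Append newly captured samples and emit every complete window.
    void push(const float* samples, std::size_t n) {
        buffer_.insert(buffer_.end(), samples, samples + n);
        fresh_ += n;
        while (buffer_.size() >= window_) {
            std::vector<float> chunk(buffer_.begin(), buffer_.begin() + window_);
            on_window_(chunk);
            // Drop the consumed part but keep the overlap for the next window.
            buffer_.erase(buffer_.begin(), buffer_.begin() + (window_ - overlap_));
            fresh_ = buffer_.size() - overlap_;
        }
    }

    // At end of stream, emit the remainder if it holds samples not yet seen.
    void flush() {
        if (fresh_ > 0) {
            on_window_(buffer_);
        }
        buffer_.clear();
        fresh_ = 0;
    }

private:
    std::size_t window_;
    std::size_t overlap_;
    Callback on_window_;
    std::vector<float> buffer_;
    std::size_t fresh_ = 0;  // buffered samples that no window has covered yet
};

// Simulates a capture device delivering `block` samples at a time and
// returns the windows a transcriber would have received.
std::vector<std::vector<float>> collect_windows(const std::vector<float>& pcm,
                                                std::size_t block,
                                                std::size_t window,
                                                std::size_t overlap) {
    assert(block > 0);
    std::vector<std::vector<float>> out;
    SlidingWindow sw(window, overlap,
                     [&out](const std::vector<float>& w) { out.push_back(w); });
    for (std::size_t i = 0; i < pcm.size(); i += block) {
        sw.push(pcm.data() + i, std::min(block, pcm.size() - i));
    }
    sw.flush();
    return out;
}
```

In a real capture loop you would call `push` from wherever your audio backend delivers samples and call `flush` once when recording stops. The window and overlap sizes are up to you; the `stream` example in the whisper.cpp repo exposes similar knobs (`--step`, `--length`, `--keep`), which is a good reference for sensible values.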
Answer selected by d3r3k-d4nk