OuteTTS - isVocoderEnabled is false #152

@Prasad-178

Hi there, I am trying to run the OuteTTS model with llama.rn (0.6.0-rc.5).
The model I'm running is Llama-OuteTTS-1.0-1B-GGUF, quantized to Q4_K_M.
I am unable to initialize the vocoder. I am using the ibm-research/DAC.speech.v1.0 model (weights_24khz_1.5kbps_v1.0.pth), which I have placed in my assets folder.

My RN Code:

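import { useCallback } from 'react';
import { Asset } from 'expo-asset';
import { initLlama } from 'llama.rn';

const VOCODER_ASSET = require('../assets/models/weights_24khz_1.5kbps_v1.0.pth');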

const initializeTTSModel = useCallback(async (modelPath: string) => {
    if (context) {
      console.log("TTS model context already exists. Releasing old context first.");
      await context.release();
      setContext(null);
    }
    setStatus('loading_model');
    setError(null);
    try {
      console.log(`Initializing OuteTTS model from: ${modelPath}`);
      const ctx = await initLlama({ model: modelPath, n_ctx: 8192 });
      setContext(ctx);
      setStatus('model_loaded');
      console.log("OuteTTS model loaded successfully.");
    } catch (e) {
      const errorMessage = e instanceof Error ? e.message : String(e);
      setError(`Failed to load OuteTTS model: ${errorMessage}`);
      setStatus('error');
      console.error(errorMessage);
    }
  }, [context]);

  const initializeVocoder = useCallback(async () => {
    if (!context) {
      const err = "TTS model not initialized. Call initializeTTSModel first.";
      setError(err);
      setStatus('error');
      console.error(err);
      return;
    }

    setStatus('loading_vocoder');
    setError(null);
    try {
      // Get the vocoder asset and ensure it's available locally
      const vocoderAsset = Asset.fromModule(VOCODER_ASSET);
      console.log(vocoderAsset);
      if (!vocoderAsset.downloaded) {
        console.log("Vocoder asset not downloaded, downloading now...");
        await vocoderAsset.downloadAsync();
      }

      const vocoderModelPath = vocoderAsset.localUri;
      if (!vocoderModelPath) {
        throw new Error("Could not get local URI for vocoder model asset.");
      }
      
      console.log(`Initializing vocoder with model asset: ${vocoderModelPath}`);
      await context.initVocoder({ path: vocoderModelPath });

      const isEnabled = await context.isVocoderEnabled();
      if (isEnabled) {
        setStatus('ready');
        console.log("Vocoder initialized and TTS is ready.");
      } else {
        throw new Error("Vocoder initialization failed. isVocoderEnabled is false.");
      }
    } catch (e) {
      const errorMessage = e instanceof Error ? e.message : String(e);
      setError(`Failed to initialize vocoder: ${errorMessage}`);
      setStatus('error');
      console.error(errorMessage);
    }
  }, [context]);

const generateSpeech = useCallback(async (textToSpeak: string, speakerReferencePath: string) => {
    if (status !== 'ready' || !context) {
      const err = "TTS is not ready. Initialize model and vocoder first.";
      setError(err);
      console.error(err);
      return null;
    }

    setStatus('generating');
    setError(null);
    try {
      console.log(`Generating guide tokens from audio file: ${speakerReferencePath}`);
      const guide_tokens = await context.getAudioCompletionGuideTokens(speakerReferencePath);
      
      console.log(`Generating audio tokens for: "${textToSpeak}"`);
      const completionResult = await context.completion({
        prompt: textToSpeak,
        temperature: 0.4,
        top_k: 40,
        top_p: 0.9,
        min_p: 0.05,
        penalty_repeat: 1.1,
        guide_tokens: guide_tokens,
      });

      if (!completionResult?.audio_tokens) {
        throw new Error("Completion did not return any audio tokens.");
      }
      
      console.log('Decoding audio tokens into PCM data...');
      const audioData = await context.decodeAudioTokens(completionResult.audio_tokens);
      console.log(`Decoding complete. Received ${audioData.length} audio samples.`);
      
      setStatus('ready');
      return audioData;
    } catch (e) {
      const errorMessage = e instanceof Error ? e.message : String(e);
      setError(`An error occurred during speech generation: ${errorMessage}`);
      setStatus('error');
      console.error(errorMessage);
      return null;
    }
  }, [context, status]);
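
A side note on the path passed to initVocoder above: vocoderAsset.localUri is a file:// URI rather than a plain filesystem path. Whether llama.rn strips the scheme for initVocoder is not confirmed here, so this is only a cautious sketch. The helper name toFilesystemPath is hypothetical and is not part of llama.rn or Expo.

```typescript
// Hypothetical helper (not a llama.rn or Expo API): turn a file:// URI into a
// plain filesystem path, in case the native side expects one.
function toFilesystemPath(uri: string): string {
  const scheme = 'file://';
  // Leave anything that is not a file:// URI untouched.
  return uri.startsWith(scheme) ? uri.slice(scheme.length) : uri;
}

// Example with a made-up cache path in the same shape as an Expo asset URI.
const examplePath = toFilesystemPath('file:///data/user/0/com.example.app/cache/vocoder.bin');
console.log(examplePath);
```

If passing the stripped path makes no difference, the URI format is probably not the cause.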

First I call the initializeTTSModel function, and these are the logs from that:

Initializing OuteTTS model from: file:///data/user/0/com.xxxxx.xxxxxxxxxxx/files/Llama-OuteTTS-1.0-1B-Q4_K_M.gguf
OuteTTS model loaded successfully.

Then I call the initializeVocoder function, and these are the logs:

{"_downloadCallbacks": [], "downloaded": false, "downloading": false, "hash": "26223f11df31fde72d8c828a7da50ca1", "height": null, "localUri": null, "name": "weights_24khz_1.5kbps_v1.0", "type": "pth", "uri": "http://localhost:8081/assets/?unstable_path=.%2Fassets%2Fmodels/weights_24khz_1.5kbps_v1.0.pth?platform=android&hash=26223f11df31fde72d8c828a7da50ca1", "width": null}
Vocoder asset not downloaded, downloading now...
Initializing vocoder with model asset: file:///data/user/0/com.xxxxx.xxxxxxxxxxx/cache/ExponentAsset-26223f11df31fde72d8c828a7da50ca1.pth
Vocoder initialization failed. isVocoderEnabled is false.
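
One check that might narrow this down: the main model is a GGUF file, while the vocoder here is a PyTorch .pth checkpoint. Assuming initVocoder expects a GGUF file in the same way initLlama does (an assumption, nothing in the logs above confirms it), looking at the first four bytes of the vocoder file would show whether it is GGUF at all, since GGUF files begin with the ASCII bytes "GGUF". The sketch below uses a hypothetical helper, hasGgufMagic, and leaves reading the bytes from disk to the caller.

```typescript
// ASCII "GGUF": the four magic bytes at the start of every GGUF file.
const GGUF_MAGIC: readonly number[] = [0x47, 0x47, 0x55, 0x46];

// Hypothetical helper (not a llama.rn API): returns true when the given
// header bytes start with the GGUF magic.
function hasGgufMagic(header: Uint8Array): boolean {
  if (header.length < GGUF_MAGIC.length) {
    return false;
  }
  return GGUF_MAGIC.every((byte, i) => header[i] === byte);
}

// Illustrative headers only, not bytes read from the actual files.
const ggufLikeHeader = new Uint8Array([0x47, 0x47, 0x55, 0x46, 0x03, 0x00, 0x00, 0x00]);
const otherHeader = new Uint8Array([0x50, 0x4b, 0x03, 0x04]);

console.log(hasGgufMagic(ggufLikeHeader)); // true
console.log(hasGgufMagic(otherHeader));    // false
```

If the check returns false for the .pth file, the weights would most likely need converting to GGUF, or swapping for a vocoder that already ships as GGUF, before initVocoder can load them.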

dependencies:

"llama.rn": "0.6.0-rc.5",
"react-native": "0.79.2",
"react": "19.0.0",
"react-dom": "19.0.0",

Any help would be greatly appreciated. I've already looked at the PR and the example for OuteTTS, but I can't work out why isVocoderEnabled returns false.
