OuteTTS - isVocoderEnabled is false #152

Hi there, I am trying to run the OuteTTS model using llama.rn (0.6.0-rc.5).
The model I'm running is Llama-OuteTTS-1.0-1B-GGUF, quantized to Q4_K_M.
I am unable to initialize the vocoder model. I am using the ibm-research/DAC.speech.v1.0 model (weights_24khz_1.5kbps_v1.0.pth), which I have placed in my assets folder.

My RN Code:
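```typescript
import { useCallback } from 'react';
import { Asset } from 'expo-asset';
import { initLlama } from 'llama.rn';

// Vocoder weights bundled with the app (a PyTorch .pth checkpoint).
const VOCODER_ASSET = require('../assets/models/weights_24khz_1.5kbps_v1.0.pth');

// The callbacks below live inside a component/hook that owns the
// `context`, `status`, and `error` state and their setters.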

  const initializeTTSModel = useCallback(async (modelPath: string) => {
    if (context) {
      console.log("TTS model context already exists. Releasing old context first.");
      await context.release();
      setContext(null);
    }
    setStatus('loading_model');
    setError(null);
    try {
      console.log(`Initializing OuteTTS model from: ${modelPath}`);
      const ctx = await initLlama({ model: modelPath, n_ctx: 8192 });
      setContext(ctx);
      setStatus('model_loaded');
      console.log("OuteTTS model loaded successfully.");
    } catch (e) {
      const errorMessage = e instanceof Error ? e.message : String(e);
      setError(`Failed to load OuteTTS model: ${errorMessage}`);
      setStatus('error');
      console.error(errorMessage);
    }
  }, [context]);

  const initializeVocoder = useCallback(async () => {
    if (!context) {
      const err = "TTS model not initialized. Call initializeTTSModel first.";
      setError(err);
      setStatus('error');
      console.error(err);
      return;
    }

    setStatus('loading_vocoder');
    setError(null);
    try {
      // Get the vocoder asset and ensure it's available locally
      const vocoderAsset = Asset.fromModule(VOCODER_ASSET);
      console.log(vocoderAsset);
      if (!vocoderAsset.downloaded) {
        console.log("Vocoder asset not downloaded, downloading now...");
        await vocoderAsset.downloadAsync();
      }

      const vocoderModelPath = vocoderAsset.localUri;
      if (!vocoderModelPath) {
        throw new Error("Could not get local URI for vocoder model asset.");
      }
      
      console.log(`Initializing vocoder with model asset: ${vocoderModelPath}`);
      await context.initVocoder({ path: vocoderModelPath });

      const isEnabled = await context.isVocoderEnabled();
      if (isEnabled) {
        setStatus('ready');
        console.log("Vocoder initialized and TTS is ready.");
      } else {
        throw new Error("Vocoder initialization failed. isVocoderEnabled is false.");
      }
    } catch (e) {
      const errorMessage = e instanceof Error ? e.message : String(e);
      setError(`Failed to initialize vocoder: ${errorMessage}`);
      setStatus('error');
      console.error(errorMessage);
    }
  }, [context]);

  const generateSpeech = useCallback(async (textToSpeak: string, speakerReferencePath: string) => {
    if (status !== 'ready' || !context) {
      const err = "TTS is not ready. Initialize model and vocoder first.";
      setError(err);
      console.error(err);
      return null;
    }

    setStatus('generating');
    setError(null);
    try {
      console.log(`Generating guide tokens from audio file: ${speakerReferencePath}`);
      const guide_tokens = await context.getAudioCompletionGuideTokens(speakerReferencePath);
      
      console.log(`Generating audio tokens for: "${textToSpeak}"`);
      const completionResult = await context.completion({
        prompt: textToSpeak,
        temperature: 0.4,
        top_k: 40,
        top_p: 0.9,
        min_p: 0.05,
        penalty_repeat: 1.1,
        guide_tokens: guide_tokens,
      });

      if (!completionResult?.audio_tokens) {
        throw new Error("Completion did not return any audio tokens.");
      }
      
      console.log('Decoding audio tokens into PCM data...');
      const audioData = await context.decodeAudioTokens(completionResult.audio_tokens);
      console.log(`Decoding complete. Received ${audioData.length} audio samples.`);
      
      setStatus('ready');
      return audioData;
    } catch (e) {
      const errorMessage = e instanceof Error ? e.message : String(e);
      setError(`An error occurred during speech generation: ${errorMessage}`);
      setStatus('error');
      console.error(errorMessage);
      return null;
    }
  }, [context, status]);
```

First I call the `initializeTTSModel` function, and these are the logs from that:
```
Initializing OuteTTS model from: file:///data/user/0/com.xxxxx.xxxxxxxxxxx/files/Llama-OuteTTS-1.0-1B-Q4_K_M.gguf
OuteTTS model loaded successfully.
```

Then I call the `initializeVocoder` function, and these are the logs:
```
{"_downloadCallbacks": [], "downloaded": false, "downloading": false, "hash": "26223f11df31fde72d8c828a7da50ca1", "height": null, "localUri": null, "name": "weights_24khz_1.5kbps_v1.0", "type": "pth", "uri": "http://localhost:8081/assets/?unstable_path=.%2Fassets%2Fmodels/weights_24khz_1.5kbps_v1.0.pth?platform=android&hash=26223f11df31fde72d8c828a7da50ca1", "width": null}
Vocoder asset not downloaded, downloading now...
Initializing vocoder with model asset: file:///data/user/0/com.xxxxx.xxxxxxxxxxx/cache/ExponentAsset-26223f11df31fde72d8c828a7da50ca1.pth
Vocoder initialization failed. isVocoderEnabled is false.
```
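The vocoder path in the log above is a `file://` URI pointing at a `.pth` file, whereas the main model is a `.gguf` file. Below is a minimal sketch of two checks that might help narrow this down. It assumes, without confirmation in this report, that `initVocoder` needs a GGUF file and a plain filesystem path. `sniffModelFile`, `ModelFileKind`, and `toNativePath` are hypothetical helper names, not llama.rn APIs, and reading the first bytes of the file is left to the caller. The two magic numbers are standard: GGUF files start with ASCII `GGUF`, and checkpoints written by recent `torch.save` are zip archives starting with `PK\x03\x04`.

```typescript
// Hypothetical helpers for debugging; none of these names come from llama.rn.
type ModelFileKind = 'gguf' | 'zip' | 'unknown';

// Classify a model file from its first bytes (at least four are needed).
function sniffModelFile(header: Uint8Array): ModelFileKind {
  const startsWith = (magic: number[]): boolean =>
    header.length >= magic.length && magic.every((b, i) => header[i] === b);

  if (startsWith([0x47, 0x47, 0x55, 0x46])) return 'gguf'; // ASCII "GGUF"
  if (startsWith([0x50, 0x4b, 0x03, 0x04])) return 'zip'; // "PK\x03\x04", e.g. a torch.save archive
  return 'unknown';
}

// Strip a leading file:// scheme so native code receives a plain path.
// Assumption: it is not confirmed here whether initVocoder already does this.
function toNativePath(uri: string): string {
  const scheme = 'file://';
  return uri.startsWith(scheme) ? uri.slice(scheme.length) : uri;
}
```

If the header comes back as `zip` or `unknown` rather than `gguf`, the file format itself would be the first thing to rule out before looking at the path or the call order.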

Dependencies:
```json
"llama.rn": "0.6.0-rc.5",
"react-native": "0.79.2",
"react": "19.0.0",
"react-dom": "19.0.0",
```

Any help would be highly appreciated. I've already looked at the PR and the example for OuteTTS, but I can't work out why `isVocoderEnabled` returns `false`.
