| 1 | +--- |
| 2 | +layout: integration |
| 3 | +name: ElevenLabs |
| 4 | +description: ElevenLabs Text-to-Speech components for Haystack. |
| 5 | +authors: |
| 6 | +  - name: Andy |
| 7 | +    socials: |
| 8 | +      github: andychert |
| 9 | +      twitter: andychert |
| 10 | +pypi: https://pypi.org/project/elevenlabs-haystack/ |
| 11 | +repo: https://github.com/andychert/elevenlabs-haystack |
| 12 | +type: Model Provider |
| 13 | +report_issue: https://github.com/andychert/elevenlabs-haystack/issues |
| 14 | +logo: /logos/elevenlabs.png |
| 15 | +version: Haystack 2.0 |
| 16 | +toc: true |
| 17 | +--- |
| 18 | + |
| 19 | +### **Table of Contents** |
| 20 | +- [Overview](#overview) |
| 21 | +- [Installation](#installation) |
| 22 | +- [Usage](#usage) |
| 23 | +- [License](#license) |
| 24 | + |
| 25 | +## Overview |
| 26 | + |
| 27 | +This package integrates the ElevenLabs Text-to-Speech API with Haystack pipelines. It converts text to speech, saves the generated audio locally, and can optionally upload it to an AWS S3 bucket. |
| 28 | + |
| 29 | +## Installation |
| 30 | + |
| 31 | +```bash |
| 32 | +pip install elevenlabs_haystack |
| 33 | +``` |
| 34 | + |
| 35 | +## Usage |
| 36 | + |
| 37 | +#### **ElevenLabs API Key** |
| 38 | + |
| 39 | +To access the ElevenLabs API, you need to create an account and obtain an API key. |
| 40 | + |
| 41 | +1. Go to the [ElevenLabs](https://elevenlabs.ai/) website and sign up for an account. |
| 42 | +2. Once logged in, navigate to the **Profile** section. |
| 43 | +3. In the **API** section, generate a new API key. |
| 44 | +4. Copy the API key. |
| 45 | + |
| 46 | +#### **AWS Credentials** |
| 47 | + |
| 48 | +To store generated audio files on AWS S3, you need AWS credentials (an Access Key ID and a Secret Access Key) and the region of your S3 bucket. |
| 49 | + |
| 50 | +1. If you don’t have an AWS account, sign up at [AWS](https://aws.amazon.com/). |
| 51 | +2. Create a new IAM user and assign the necessary permissions to allow the user to upload files to S3. The `AmazonS3FullAccess` policy is sufficient for this example. |
| 52 | +3. Once the IAM user is created, download or note the **AWS Access Key ID** and **Secret Access Key**. |
| 53 | +4. Identify the **AWS Region** where your S3 bucket resides (e.g., `us-east-1`). This information can be found in the AWS Management Console. |
| 54 | +5. Finally, create or identify the S3 bucket where the generated audio files will be saved. |
| 55 | + |
| 56 | +Create a `.env` file in the project root with the following content, replacing the placeholder values with your own credentials: |
| 57 | + |
| 58 | +```bash |
| 59 | +ELEVENLABS_API_KEY=sk_your_elevenlabs_api_key_here |
| 60 | +AWS_ACCESS_KEY_ID=your_aws_access_key_id |
| 61 | +AWS_SECRET_ACCESS_KEY=your_aws_secret_access_key |
| 62 | +AWS_REGION_NAME=us-east-1 |
| 63 | +AWS_S3_BUCKET_NAME=your_s3_bucket_name |
| 64 | +``` |
| 65 | + |
| 66 | +These variables are loaded automatically with `dotenv`; the examples below read them through `Secret.from_env_var`, so credentials never appear in the code. |
| 67 | + |
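A quick way to confirm that the `.env` values are visible before constructing the component is to check the process environment. The sketch below is an illustration and not part of `elevenlabs_haystack`: `REQUIRED_VARS` and `missing_env_vars` are hypothetical names, and it assumes the variables are already in the environment (for example after calling `load_dotenv()` from the `python-dotenv` package at startup).

```python
import os

# Variable names copied from the .env example above.
REQUIRED_VARS = (
    "ELEVENLABS_API_KEY",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_REGION_NAME",
    "AWS_S3_BUCKET_NAME",
)


def missing_env_vars(environ=None, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper: `environ` defaults to os.environ, but any mapping
    can be passed in, which keeps the check easy to test.
    """
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

If only local output is wanted, passing a shorter `required` tuple (for instance just `("ELEVENLABS_API_KEY",)`) keeps the check from flagging the AWS variables.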
| 68 | +### Basic Text-to-Speech Example |
| 69 | + |
| 70 | +This example shows how to use the `ElevenLabsTextToSpeech` component to convert text to speech and save the generated audio file locally or in an AWS S3 bucket. It uses environment variables to access sensitive credentials. |
| 71 | + |
| 72 | +```python |
| 73 | +from haystack.utils import Secret |
| 74 | +from elevenlabs_haystack import ElevenLabsTextToSpeech |
| 75 | + |
| 76 | +# Initialize the ElevenLabsTextToSpeech component using environment variables for sensitive data |
| 77 | +tts = ElevenLabsTextToSpeech( |
| 78 | +    elevenlabs_api_key=Secret.from_env_var("ELEVENLABS_API_KEY"), |
| 79 | +    output_folder="audio_files", # Save the generated audio locally |
| 80 | +    voice_id="Alice", # ElevenLabs voice ID |
| 81 | +    aws_s3_bucket_name=Secret.from_env_var("AWS_S3_BUCKET_NAME"), # S3 bucket for optional upload |
| 82 | +    aws_s3_output_folder="s3_files", # Save the generated audio to AWS S3 |
| 83 | +    aws_access_key_id=Secret.from_env_var("AWS_ACCESS_KEY_ID"), |
| 84 | +    aws_secret_access_key=Secret.from_env_var("AWS_SECRET_ACCESS_KEY"), |
| 85 | +    aws_region_name=Secret.from_env_var("AWS_REGION_NAME"), # AWS region |
| 86 | +    voice_settings={ |
| 87 | +        "stability": 0.75, |
| 88 | +        "similarity_boost": 0.75, |
| 89 | +        "style": 0.5, |
| 90 | +        "use_speaker_boost": True, # Optional voice settings |
| 91 | +    }, |
| 92 | +) |
| 93 | + |
| 94 | +# Run the text-to-speech conversion |
| 95 | +result = tts.run("Hello, world!") |
| 96 | + |
| 97 | +# Print the result |
| 98 | +print(result) |
| 99 | + |
| 100 | +""" |
| 101 | +{ |
| 102 | +    "id": "elevenlabs-id", |
| 103 | +    "file_name": "audio_files/elevenlabs-id.mp3", |
| 104 | +    "s3_file_name": "s3_files/elevenlabs-id.mp3", |
| 105 | +    "s3_bucket_name": "test-bucket", |
| 106 | +    "s3_presigned_url": "https://test-bucket.s3.amazonaws.com/s3_files/elevenlabs-id.mp3" |
| 107 | +} |
| 108 | +""" |
| 109 | +``` |
| 110 | + |
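The dictionary returned by `run` can be reduced to a single location for the generated audio. The following is a minimal sketch, assuming the key names shown in the sample output above; `audio_location` is a hypothetical helper, and the assumption that the S3 keys are missing or `None` when no upload happens has not been checked against the package.

```python
from typing import Optional


def audio_location(result: dict) -> Optional[str]:
    """Pick where the audio can be fetched from.

    Prefers the presigned S3 URL and falls back to the local file path.
    Returns None when neither key carries a value.
    """
    return result.get("s3_presigned_url") or result.get("file_name")


# Placeholder values mirroring the sample output above.
sample_result = {
    "id": "elevenlabs-id",
    "file_name": "audio_files/elevenlabs-id.mp3",
    "s3_presigned_url": "https://test-bucket.s3.amazonaws.com/s3_files/elevenlabs-id.mp3",
}
print(audio_location(sample_result))
```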
| 111 | +### Example Using Haystack Pipeline |
| 112 | + |
| 113 | +This example demonstrates how to integrate the `ElevenLabsTextToSpeech` component into a Haystack pipeline. Additionally, we define a `WelcomeTextGenerator` component that generates a personalized welcome message. |
| 114 | + |
| 115 | +```python |
| 116 | +from haystack import component, Pipeline |
| 117 | +from haystack.utils import Secret |
| 118 | +from elevenlabs_haystack import ElevenLabsTextToSpeech |
| 119 | + |
| 120 | +# Define a simple component to generate a welcome message |
| 121 | +@component |
| 122 | +class WelcomeTextGenerator: |
| 123 | +    """ |
| 124 | +    A component generating a personal welcome message and making it upper case. |
| 125 | +    """ |
| 126 | +    @component.output_types(welcome_text=str, note=str) |
| 127 | +    def run(self, name: str): |
| 128 | +        return { |
| 129 | +            "welcome_text": f'Hello {name}, welcome to Haystack!'.upper(), |
| 130 | +            "note": "welcome message is ready" |
| 131 | +        } |
| 132 | + |
| 133 | +# Create a Pipeline |
| 134 | +text_pipeline = Pipeline() |
| 135 | + |
| 136 | +# Add WelcomeTextGenerator to the Pipeline |
| 137 | +text_pipeline.add_component( |
| 138 | +    name="welcome_text_generator", |
| 139 | +    instance=WelcomeTextGenerator() |
| 140 | +) |
| 141 | + |
| 142 | +# Add ElevenLabsTextToSpeech to the Pipeline using environment variables |
| 143 | +text_pipeline.add_component( |
| 144 | +    name="tts", |
| 145 | +    instance=ElevenLabsTextToSpeech( |
| 146 | +        elevenlabs_api_key=Secret.from_env_var("ELEVENLABS_API_KEY"), |
| 147 | +        output_folder="audio_files", # Save the generated audio locally |
| 148 | +        voice_id="Alice", # ElevenLabs voice ID |
| 149 | +        aws_s3_bucket_name=Secret.from_env_var("AWS_S3_BUCKET_NAME"), # S3 bucket for optional upload |
| 150 | +        aws_s3_output_folder="s3_files", # Save the generated audio to AWS S3 |
| 151 | +        aws_access_key_id=Secret.from_env_var("AWS_ACCESS_KEY_ID"), |
| 152 | +        aws_secret_access_key=Secret.from_env_var("AWS_SECRET_ACCESS_KEY"), |
| 153 | +        aws_region_name=Secret.from_env_var("AWS_REGION_NAME"), # Load region from env |
| 154 | +        voice_settings={ |
| 155 | +            "stability": 0.75, |
| 156 | +            "similarity_boost": 0.75, |
| 157 | +            "style": 0.5, |
| 158 | +            "use_speaker_boost": True, # Optional voice settings |
| 159 | +        }, |
| 160 | +    ), |
| 161 | +) |
| 162 | + |
| 163 | +# Connect the output of WelcomeTextGenerator to the input of ElevenLabsTextToSpeech |
| 164 | +text_pipeline.connect(sender="welcome_text_generator.welcome_text", receiver="tts") |
| 165 | + |
| 166 | +# Run the pipeline with a sample name |
| 167 | +result = text_pipeline.run({"welcome_text_generator": {"name": "Bilge"}}) |
| 168 | + |
| 169 | +# Print the result |
| 170 | +print(result) |
| 171 | + |
| 172 | +""" |
| 173 | +{ |
| 174 | +    "welcome_text_generator": {"note": "welcome message is ready"}, |
| 175 | +    "tts": { |
| 176 | +        "id": "elevenlabs-id", "file_name": "audio_files/elevenlabs-id.mp3", |
| 177 | +        "s3_file_name": "s3_files/elevenlabs-id.mp3", "s3_bucket_name": "test-bucket", |
| 178 | +        "s3_presigned_url": "https://test-bucket.s3.amazonaws.com/s3_files/elevenlabs-id.mp3"} |
| 179 | +} |
| 180 | +""" |
| 181 | +``` |
| 182 | + |
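`Pipeline.run` returns a dictionary keyed by component name, so the audio details sit under the `tts` name passed to `add_component`. The sketch below shows one way to pull them out; `output_of` is a hypothetical helper, and `sample` is a hand-written placeholder shaped like a pipeline result rather than real output.

```python
def output_of(pipeline_result: dict, component_name: str) -> dict:
    """Return one component's outputs, or an empty dict if it produced none."""
    return pipeline_result.get(component_name) or {}


# Placeholder shaped like a Pipeline.run() result; values are illustrative.
sample = {
    "welcome_text_generator": {"note": "welcome message is ready"},
    "tts": {
        "id": "elevenlabs-id",
        "file_name": "audio_files/elevenlabs-id.mp3",
    },
}

tts_output = output_of(sample, "tts")
print(tts_output["file_name"])
```

Returning an empty dict for an unknown component keeps later `.get(...)` lookups safe when a component is removed from the pipeline.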
| 183 | +## License |
| 184 | + |
| 185 | +This project is licensed under the MIT License. |