Commit 52ddc5c

feat: ✨ Added Defaults and custom language and speaker to endpoint
1 parent 81526f3 commit 52ddc5c

3 files changed: 56 additions and 19 deletions

README.md

Lines changed: 26 additions & 9 deletions
@@ -1,23 +1,40 @@
 # MeloTTS API Server
-A quick easy way to access [MeloTTS](https://github.com/myshell-ai/MeloTTS) through REST API calls.

-Currently only locked to english with american accent. Easy fix if requested, or you can just change the hardcode speaker_ids before build if needed.
+A quick easy way to access [MeloTTS](https://github.com/myshell-ai/MeloTTS) through REST API calls.

-## Usage
 Assuming you have docker installed and setup
+
 ### Build
-git clone git@github.com:timhagel/MeloTTS-API-Server.git
+
+git clone git@github.com:timhagel/melotts-api-server.git
 cd melotts-api-server
 docker build -t melotts-api-server .
-### Run
-docker run -p 8888:8080 melotts-api-server
+
+### Run (English)
+
+docker run -p 8888:8080 -e DEFAULT_SPEED=1 -e DEFAULT_LANGUAGE=EN -e DEFAULT_SPEAKER_ID=EN-US melotts-api-server
+
 ### Call API
-**localhost:8888/text_to_speech**
+
+**localhost:8888/convert/tts**
+
+##### Use Environment Defaults

 {
 "text": "Put input here"
 }
-Response : en-us.wav

-### Acknowledgement
+Response : .wav
+
+##### Customize (Everything except for "text" is optional)
+
+{
+"text": "input",
+"speed": "speed",
+"language": "language",
+"speaker_id": "speaker_id"
+}
+
+## Acknowledgement
+
 This just a API server for the awesome work of [MeloTTS](https://github.com/myshell-ai/MeloTTS) from [MyShell](https://github.com/myshell-ai)
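
The request shape documented in the README hunk above can be exercised from a small client. The following is a minimal sketch, not part of the commit: it assumes the container is published on `localhost:8888` as in the `docker run` line, and the helper names `build_tts_request` and `synthesize` are hypothetical. Optional fields are left out of the JSON body when they are not given, so the server's environment defaults apply. Note that `speed` is declared as a float in `app.py`, so a number is sent rather than the string placeholder shown in the README.

```python
import json
import urllib.request

# Assumption: host and port follow the README's `docker run -p 8888:8080` example.
BASE_URL = "http://localhost:8888"


def build_tts_request(text, speed=None, language=None, speaker_id=None, base_url=BASE_URL):
    """Build a POST request for /convert/tts, omitting any option that is None."""
    payload = {"text": text}
    optional = {"speed": speed, "language": language, "speaker_id": speaker_id}
    payload.update({key: value for key, value in optional.items() if value is not None})
    return urllib.request.Request(
        f"{base_url}/convert/tts",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, out_path="speech.wav", **options):
    """Send the request and save the returned audio bytes (needs a running server)."""
    request = build_tts_request(text, **options)
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as audio_file:
        audio_file.write(response.read())
    return out_path
```

With the server running, `synthesize("Put input here")` would use the environment defaults, while `synthesize("input", speed=1.2, language="EN", speaker_id="EN-US")` would override them for a single call.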

app.py

Lines changed: 28 additions & 9 deletions
@@ -1,27 +1,46 @@
-from fastapi import FastAPI, Body
+import os
+import uvicorn
+from fastapi import FastAPI, Body, Depends
 from pydantic import BaseModel
 from fastapi.responses import FileResponse
 from melo.api import TTS
+from dotenv import load_dotenv
+import tempfile

-speed = 1.0
+load_dotenv()
+DEFAULT_SPEED = float(os.getenv('DEFAULT_SPEED'))
+DEFAULT_LANGUAGE = os.getenv('DEFAULT_LANGUAGE')
+DEFAULT_SPEAKER_ID = os.getenv('DEFAULT_SPEAKER_ID')
 device = 'auto' # Will automatically use GPU if available

 class TextModel(BaseModel):
     text: str
+    speed: float = DEFAULT_SPEED
+    language: str = DEFAULT_LANGUAGE
+    speaker_id: str = DEFAULT_SPEAKER_ID

 app = FastAPI()

-@app.post("/text_to_speech")
-async def create_upload_file(body: TextModel = Body(...)):
-    model = TTS(language='EN', device=device)
+def get_tts_model(body: TextModel):
+    return TTS(language=body.language, device=device)
+
+@app.post("/convert/tts")
+async def create_upload_file(body: TextModel = Body(...), model: TTS = Depends(get_tts_model)):
     speaker_ids = model.hps.data.spk2id

-    output_path = 'en-us.wav'
-    model.tts_to_file(body.text, speaker_ids['EN-US'], output_path, speed=speed)
+    # Create a temporary file
+    output_path = body.language + "_" + body.speaker_id + ".wav"
+    model.tts_to_file(body.text, speaker_ids[body.speaker_id], output_path, speed=body.speed)
+
+    # Create a temporary file
+    output_path = body.language + "_" + body.speaker_id + ".wav"
+    model.tts_to_file(body.text, speaker_ids[body.speaker_id], output_path, speed=body.speed)

+    print(os.path.basename(output_path))
     # Return the audio file
-    return FileResponse("en-us.wav", media_type="audio/mpeg", filename="en-us.wav")
+    response = FileResponse(output_path, media_type="audio/mpeg", filename=os.path.basename(output_path))
+
+    return response

 if __name__ == "__main__":
-    import uvicorn
     uvicorn.run(app, host="0.0.0.0", port=8080)
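
Two details in the `app.py` hunk above may be worth hardening. `float(os.getenv('DEFAULT_SPEED'))` raises a `TypeError` when the variable is unset, because `os.getenv` then returns `None`. The output path is also built only from the language and speaker ID, so two requests with the same pair write to the same file, and the imported `tempfile` module is not used in the lines shown. The following is a minimal sketch of one possible approach, not the commit's code: `env_float` and `make_output_path` are hypothetical helpers, and the fallback values are assumptions that mirror the README's `docker run` example.

```python
import os
import tempfile


def env_float(name, fallback):
    """Read a float from the environment, using `fallback` when unset or malformed."""
    raw = os.getenv(name)
    if raw is None:
        return fallback
    try:
        return float(raw)
    except ValueError:
        return fallback


# Assumed fallbacks (hypothetical); they mirror the README's `docker run` example.
DEFAULT_SPEED = env_float("DEFAULT_SPEED", 1.0)
DEFAULT_LANGUAGE = os.getenv("DEFAULT_LANGUAGE", "EN")
DEFAULT_SPEAKER_ID = os.getenv("DEFAULT_SPEAKER_ID", "EN-US")


def make_output_path(language, speaker_id):
    """Reserve a unique .wav path in the system temp directory for one request."""
    fd, path = tempfile.mkstemp(prefix=f"{language}_{speaker_id}_", suffix=".wav")
    os.close(fd)  # close the descriptor; the TTS call would reopen the file by name
    return path
```

In the endpoint, `output_path = make_output_path(body.language, body.speaker_id)` could then stand in for the string concatenation. The caller remains responsible for deleting the file once the response has been sent.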

requirements.txt

Lines changed: 2 additions & 1 deletion
@@ -1 +1,2 @@
-fastapi[all] == 0.110.0
+fastapi[all] == 0.110.0
+python-dotenv == 1.0.0
