Text-to-Speech

Google TTS models are called via /v1/audio/speech. The response is raw audio bytes.

curl https://llmtr.com/v1/audio/speech \
-H "Authorization: Bearer sk_your_key" \
-H "Content-Type: application/json" \
-d '{
"model": "google/gemini-2.5-flash-preview-tts",
"input": "Read today'\''s summary in a calm, clear voice.",
"voice": "Kore"
}' --output speech.wav

Field            Description
model            Canonical TTS model ID
input            Text to read (UTF-8)
voice            Voice character (model-specific list)
response_format  wav, mp3, opus, aac, flac
speed            0.25 - 4.0 (optional)
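
As a rough illustration of how the fields above fit together, here is a minimal Python sketch that builds the request body and posts it using only the standard library. The helper names `build_payload` and `synthesize` are hypothetical, not part of any official client. The client-side checks assume the `speed` bounds are inclusive and that every listed `response_format` is accepted by the chosen model, which may not hold for all models.

```python
import json
import urllib.request

API_URL = "https://llmtr.com/v1/audio/speech"  # endpoint from the curl example
FORMATS = {"wav", "mp3", "opus", "aac", "flac"}  # response_format values from the field table


def build_payload(model, text, voice, response_format=None, speed=None):
    """Build the JSON body, checking the optional fields client-side.

    Hypothetical helper: the range and format checks mirror the field table;
    the server may apply different or additional rules.
    """
    payload = {"model": model, "input": text, "voice": voice}
    if response_format is not None:
        if response_format not in FORMATS:
            raise ValueError(f"unsupported response_format: {response_format!r}")
        payload["response_format"] = response_format
    if speed is not None:
        # Assumption: the documented 0.25 - 4.0 range is inclusive.
        if not 0.25 <= speed <= 4.0:
            raise ValueError("speed must be between 0.25 and 4.0")
        payload["speed"] = speed
    return payload


def synthesize(api_key, payload, out_path):
    """POST the payload and write the raw audio bytes of the response to out_path."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out:
        out.write(response.read())


body = build_payload(
    "google/gemini-2.5-flash-preview-tts",
    "Read today's summary in a calm, clear voice.",
    "Kore",
    response_format="mp3",
    speed=1.25,
)
# Uncomment to send the request with a real key:
# synthesize("sk_your_key", body, "speech.mp3")
```

Validating before sending keeps obviously malformed requests from costing a round trip. The output file extension should match whatever `response_format` was requested.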

Example voices for Gemini TTS: Kore, Puck, Charon, Aoede. The full list is on the model detail page.

Billing is per character or per second, depending on the model. The model card lists each model’s native unit.