Text-to-Speech

Google TTS models are called with a POST request to /v1/audio/speech. The response body is the raw audio bytes rather than JSON, so save it directly to a file.

Example

curl https://llmtr.com/v1/audio/speech \
  -H "Authorization: Bearer llmtr-your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemini-2.5-flash-preview-tts",
    "input": "Read today'\''s summary in a calm, clear voice.",
    "voice": "Kore"
  }' --output speech.wav
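
The same request can be sketched in Python using only the standard library. This is a minimal illustration rather than an official client: the helper names `build_speech_request` and `save_speech` are hypothetical, and the URL, headers, and body fields are copied from the curl example above.

```python
import json
import urllib.request

API_URL = "https://llmtr.com/v1/audio/speech"  # endpoint from the curl example


def build_speech_request(api_key: str, text: str, voice: str = "Kore",
                         model: str = "google/gemini-2.5-flash-preview-tts"
                         ) -> urllib.request.Request:
    """Build the POST request. Nothing is sent until urlopen() is called."""
    body = json.dumps({"model": model, "input": text, "voice": voice})
    return urllib.request.Request(
        API_URL,
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def save_speech(api_key: str, text: str, path: str = "speech.wav") -> int:
    """Send the request and write the raw audio bytes to `path`.

    Returns the number of bytes written. HTTP errors surface as
    urllib.error.HTTPError, raised by urlopen().
    """
    request = build_speech_request(api_key, text)
    with urllib.request.urlopen(request, timeout=60) as response:
        audio = response.read()  # raw audio bytes, not JSON
    with open(path, "wb") as f:
        f.write(audio)
    return len(audio)
```

Calling `save_speech("llmtr-your_key", "Read today's summary in a calm, clear voice.")` would write `speech.wav` to the current directory. Because `json.dumps` handles the escaping, the apostrophe needs no special treatment here.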

Parameters

Field	Description
`model`	Canonical TTS model ID
`input`	Text to read (UTF-8)
`voice`	Voice character (model-specific list)
`response_format`	Output audio format: `wav`, `mp3`, `opus`, `aac`, or `flac`
`speed`	Speech speed, from 0.25 to 4.0 (optional)
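
The constraints in the table can be checked on the client before a request is sent. The sketch below uses a hypothetical helper, `build_payload`, and makes two assumptions the table does not confirm: that `response_format` may be omitted (the curl example leaves it out) and that the `speed` bounds are inclusive.

```python
from typing import Optional

ALLOWED_FORMATS = {"wav", "mp3", "opus", "aac", "flac"}  # from the table
SPEED_MIN, SPEED_MAX = 0.25, 4.0                          # from the table


def build_payload(model: str, text: str, voice: str,
                  response_format: Optional[str] = None,
                  speed: Optional[float] = None) -> dict:
    """Return a request body dict, raising ValueError on invalid values."""
    payload = {"model": model, "input": text, "voice": voice}
    if response_format is not None:
        if response_format not in ALLOWED_FORMATS:
            raise ValueError(f"unsupported response_format: {response_format!r}")
        payload["response_format"] = response_format
    if speed is not None:
        # Assumption: both bounds are inclusive.
        if not SPEED_MIN <= speed <= SPEED_MAX:
            raise ValueError(
                f"speed must be between {SPEED_MIN} and {SPEED_MAX}, got {speed}"
            )
        payload["speed"] = speed
    return payload
```

Optional fields are left out of the body when not set, so the server's own defaults apply. Voice names are not validated here because the list is model-specific.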

Voice characters

Example voices for Gemini TTS include Kore, Puck, Charon, and Aoede. The full list is on the model detail page.

Pricing

Usage is billed per character or per second, depending on the model. The model card lists each model’s native billing unit.
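
A rough cost estimate for either billing unit could look like the sketch below. The function names are hypothetical, the rates are placeholders to be taken from the model card, and two assumptions are made: characters are counted as Unicode code points of the input text, and the audio is a standard PCM WAV file whose header carries a valid frame count. The platform's actual metering may differ.

```python
import io
import wave


def wav_duration_seconds(wav_bytes: bytes) -> float:
    """Duration of a PCM WAV file, computed as frames / sample rate."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        return w.getnframes() / w.getframerate()


def cost_per_character(text: str, rate_per_character: float) -> float:
    """Estimate for character-billed models (rate comes from the model card)."""
    return len(text) * rate_per_character


def cost_per_second(wav_bytes: bytes, rate_per_second: float) -> float:
    """Estimate for second-billed models (rate comes from the model card)."""
    return wav_duration_seconds(wav_bytes) * rate_per_second
```

For formats other than WAV, the duration would have to be read with a decoder for that format, since the standard-library `wave` module only handles WAV.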