Multimodal Overview
For supported models, send messages[].content as an OpenAI-compatible content-part array instead of a plain string. Currently accepted parts:
- text
- image_url
- input_audio
- input_file

When should you use it?
Check a single model’s modalities via the catalog:
    curl https://llmtr.com/api/models \
      -H "Authorization: Bearer llmtr-your_key" \
      | jq '.data[] | select(.canonicalId=="google/gemini-2.5-flash") | .modalities'

To list every model that supports a specific operation (image generation, embeddings, TTS, …), use the operation filter:
    # All image-generating models
    curl "https://llmtr.com/api/models?operation=IMAGES_GENERATIONS" \
      -H "Authorization: Bearer llmtr-your_key"

    # Embedding models
    curl "https://llmtr.com/api/models?operation=EMBEDDINGS" \
      -H "Authorization: Bearer llmtr-your_key"

    # Text-to-speech models
    curl "https://llmtr.com/api/models?operation=AUDIO_SPEECH" \
      -H "Authorization: Bearer llmtr-your_key"

When a model is sent to the wrong endpoint, the 400 unsupported_operation response includes error.details.supported_endpoints and error.details.suggested_endpoint to point you at the correct route. See Errors.

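A client can read these fields to recover from a misrouted request. The following is a minimal sketch in Python, not the official client: it assumes the error body is JSON carrying the `error.details` fields named above, and the helper name `suggested_route` and the sample endpoint paths are hypothetical.

```python
import json
from typing import Optional


def suggested_route(status: int, body: str) -> Optional[str]:
    """Return error.details.suggested_endpoint from a 400 response, else None."""
    if status != 400:
        return None
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    error = payload.get("error")
    if not isinstance(error, dict):
        return None
    details = error.get("details")
    if not isinstance(details, dict):
        return None
    endpoint = details.get("suggested_endpoint")
    return endpoint if isinstance(endpoint, str) else None


# Illustrative body only: the endpoint paths below are made up for this example.
sample = json.dumps({
    "error": {
        "details": {
            "supported_endpoints": ["/example/images"],
            "suggested_endpoint": "/example/images",
        }
    }
})
print(suggested_route(400, sample))
```

Returning `None` for anything that does not match the expected shape lets the caller fall back to its normal error handling instead of retrying blindly.
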
Content-part shape
    {
      "messages": [
        {
          "role": "user",
          "content": [
            { "type": "text", "text": "What's in this image?" },
            { "type": "image_url", "image_url": { "url": "https://..." } }
          ]
        }
      ]
    }

Limits and notes
- Media is sent through the JSON body. Remote URLs are safer than inline base64.
- Keep inline base64 audio clips short (< 1 MB recommended).
- For large files, PDFs, video, or reusable media, use the Files API.
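
The notes above can be sketched as a few small Python helpers that build content parts. This is a minimal sketch under stated assumptions: the helper names and the size constant are hypothetical, the `input_audio` part is assumed to follow the OpenAI shape (`data` plus `format`), and the 1 MB recommendation is applied to the raw bytes because the note does not say whether it means raw or encoded size.

```python
import base64
from typing import Any, Dict, List

# Recommended ceiling for inline audio from the notes above (assumed to mean raw bytes).
MAX_INLINE_AUDIO_BYTES = 1_000_000


def image_part(url: str) -> Dict[str, Any]:
    """Build an image_url part that points at a remote URL."""
    return {"type": "image_url", "image_url": {"url": url}}


def audio_part(raw: bytes, fmt: str = "wav") -> Dict[str, Any]:
    """Build an inline input_audio part, refusing clips at or over the recommended size."""
    if len(raw) >= MAX_INLINE_AUDIO_BYTES:
        raise ValueError("clip too large to inline; use the Files API instead")
    encoded = base64.b64encode(raw).decode("ascii")
    return {"type": "input_audio", "input_audio": {"data": encoded, "format": fmt}}


def user_message(text: str, *parts: Dict[str, Any]) -> Dict[str, Any]:
    """Combine a text part with any media parts into one user message."""
    content: List[Dict[str, Any]] = [{"type": "text", "text": text}]
    content.extend(parts)
    return {"role": "user", "content": content}


message = user_message(
    "What's in this image?",
    image_part("https://example.com/cat.png"),  # placeholder URL
)
```

The resulting `message` dict can be placed in the `messages` array of a request body. Raising on oversized audio makes the client fail before sending, which is the point at which switching to the Files API is cheapest.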