Multimodal Overview

For supported models, send messages[].content as an OpenAI-compatible array of content parts instead of a plain string. Currently accepted part types:

  • text
  • image_url
  • input_audio
  • input_file

Check a model’s modalities via the catalog:

curl https://llmtr.com/api/models \
-H "Authorization: Bearer sk_your_key" \
| jq '.data[] | select(.canonicalId=="google/gemini-2.5-flash") | .modalities'
An example request body combining a text part and an image part:

{
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "What's in this image?" },
        {
          "type": "image_url",
          "image_url": { "url": "https://..." }
        }
      ]
    }
  ]
}
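A request body like the one above can be assembled programmatically. The sketch below builds the same text + image_url structure in Python; the helper names (`text_part`, `image_part`) and the example URL are illustrative, not part of the API:

```python
import json

def text_part(text):
    # A "text" content part, matching the OpenAI-compatible shape above.
    return {"type": "text", "text": text}

def image_part(url):
    # An "image_url" content part; url may be a remote URL or a data: URI.
    return {"type": "image_url", "image_url": {"url": url}}

body = {
    "messages": [
        {
            "role": "user",
            "content": [
                text_part("What's in this image?"),
                image_part("https://example.com/photo.png"),  # placeholder URL
            ],
        }
    ]
}

print(json.dumps(body, indent=2))
```

Send `body` as the JSON payload of your chat completions request in the usual way.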
  • Media is sent through the JSON body. Remote URLs are safer than inline base64.
  • Keep inline base64 audio clips short (< 1 MB recommended).
  • For large files, PDFs, video, or reusable media, use the Files API.
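For inline audio, the guidelines above can be enforced before the request is sent. This sketch builds an input_audio part with base64 data and rejects clips over the recommended 1 MB; the nested {"data", "format"} field names follow OpenAI's convention and should be confirmed against your model's documentation:

```python
import base64

MAX_INLINE_BYTES = 1_000_000  # the "< 1 MB recommended" guideline above

def audio_part(raw_bytes, fmt="wav"):
    # An "input_audio" part with inline base64 data. The {"data", "format"}
    # field names are assumed from OpenAI compatibility, not confirmed here.
    if len(raw_bytes) > MAX_INLINE_BYTES:
        raise ValueError("clip too large for inline base64; use the Files API")
    return {
        "type": "input_audio",
        "input_audio": {
            "data": base64.b64encode(raw_bytes).decode("ascii"),
            "format": fmt,
        },
    }

part = audio_part(b"\x00" * 1024)  # 1 KiB of silence as a stand-in clip
```

The resulting dict drops into the same content array as the text and image parts.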