Multimodal Overview

For supported models, send messages[].content as an OpenAI-compatible content-part array instead of a plain string. Currently accepted parts:

text
image_url
input_audio
input_file

When should you use it?

Check a single model’s modalities via the catalog:

curl https://llmtr.com/api/models \
  -H "Authorization: Bearer llmtr-your_key" \
  | jq '.data[] | select(.canonicalId=="google/gemini-2.5-flash") | .modalities'
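
The same lookup can be done in client code once the catalog response has been parsed. This is a minimal Python sketch, assuming the response has the shape the jq filter above implies: a top-level data array whose entries carry canonicalId and modalities. The function name, the model IDs, and the sample payload are hypothetical. The modalities value is returned untouched because its exact format is not described here.

```python
from typing import Any, Optional


def modalities_for(catalog: dict, canonical_id: str) -> Optional[Any]:
    """Return the `modalities` field of the matching catalog entry, or None."""
    for model in catalog.get("data", []):
        if model.get("canonicalId") == canonical_id:
            return model.get("modalities")
    return None


# Hypothetical response body for illustration; IDs and values are made up.
sample_catalog = {
    "data": [
        {"canonicalId": "example/vision-model", "modalities": {"input": ["text", "image"]}},
        {"canonicalId": "example/text-only", "modalities": {"input": ["text"]}},
    ]
}

print(modalities_for(sample_catalog, "example/vision-model"))
```

Returning None for an unknown ID lets the caller decide whether to fall back to a plain-string message.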

To list every model that supports a specific operation (image generation, embeddings, TTS, …) use the operation filter:

# All image-generating models
curl "https://llmtr.com/api/models?operation=IMAGES_GENERATIONS" \
  -H "Authorization: Bearer llmtr-your_key"

# Embedding models
curl "https://llmtr.com/api/models?operation=EMBEDDINGS" \
  -H "Authorization: Bearer llmtr-your_key"

# Text-to-speech models
curl "https://llmtr.com/api/models?operation=AUDIO_SPEECH" \
  -H "Authorization: Bearer llmtr-your_key"

When a model is sent to the wrong endpoint, the 400 unsupported_operation response includes error.details.supported_endpoints and error.details.suggested_endpoint to point you at the correct route. See Errors.
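
A client can use these fields to pick the route to retry on. The Python sketch below assumes the parsed error body nests them under error.details as described above, and that supported_endpoints is a list of strings. The function name and the sample routes are hypothetical placeholders, not real paths.

```python
from typing import Optional


def suggested_endpoint(error_body: dict) -> Optional[str]:
    """Pick a route to retry on from a parsed unsupported_operation error body."""
    details = (error_body.get("error") or {}).get("details") or {}
    suggestion = details.get("suggested_endpoint")
    if suggestion:
        return suggestion
    # Fall back to the first listed endpoint (assumes a list of strings).
    supported = details.get("supported_endpoints") or []
    return supported[0] if supported else None


# Hypothetical error body; the routes are placeholders.
sample_error = {
    "error": {
        "details": {
            "supported_endpoints": ["/placeholder/route-a", "/placeholder/route-b"],
            "suggested_endpoint": "/placeholder/route-a",
        }
    }
}

print(suggested_endpoint(sample_error))
```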

Content-part shape

{
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "What's in this image?" },
        {
          "type": "image_url",
          "image_url": { "url": "https://..." }
        }
      ]
    }
  ]
}
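
Building the array with small helpers keeps each part's shape in one place. This is a minimal Python sketch that produces the same structure as the JSON above. The helper names are hypothetical, the top-level model field is assumed from the OpenAI-compatible convention, and the image URL is a placeholder.

```python
import json


def text_part(text: str) -> dict:
    """Build a `text` content part."""
    return {"type": "text", "text": text}


def image_url_part(url: str) -> dict:
    """Build an `image_url` content part pointing at a remote image."""
    return {"type": "image_url", "image_url": {"url": url}}


def user_message(*parts: dict) -> dict:
    """Wrap content parts in a single user message."""
    return {"role": "user", "content": list(parts)}


body = {
    # Assumption: the request names its model in a top-level `model` field.
    "model": "google/gemini-2.5-flash",
    "messages": [
        user_message(
            text_part("What's in this image?"),
            image_url_part("https://example.com/photo.png"),  # placeholder URL
        )
    ],
}

print(json.dumps(body, indent=2))
```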

Limits and notes

Media is sent through the JSON body. Remote URLs are safer than inline base64.
Keep inline base64 audio clips short (< 1 MB recommended).
For large files, PDFs, video, or reusable media use the Files API.
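
The audio size recommendation can be checked on the client before a request is built. The Python sketch below makes several assumptions: "1 MB" is read as 1,000,000 characters of base64 text, the input_audio part follows the OpenAI convention of a data field plus a format field, and the helper name is hypothetical. The limit above is a recommendation, so raising an error is a client-side choice and not documented API behaviour.

```python
import base64

# Assumption: "1 MB" is interpreted as 10**6 characters of base64 text.
MAX_INLINE_AUDIO_CHARS = 1_000_000


def inline_audio_part(raw: bytes, fmt: str = "wav") -> dict:
    """Base64-encode a short clip into an `input_audio` content part."""
    encoded = base64.b64encode(raw).decode("ascii")
    if len(encoded) >= MAX_INLINE_AUDIO_CHARS:
        raise ValueError("clip too large to inline; consider the Files API instead")
    # Assumed shape, following the OpenAI convention for input_audio parts.
    return {"type": "input_audio", "input_audio": {"data": encoded, "format": fmt}}


part = inline_audio_part(b"\x00" * 16)
print(part["type"], len(part["input_audio"]["data"]))
```

Base64 expands data by about a third, so a clip needs to be under roughly 750 KB of raw bytes to pass this check.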
