xAI Grok Models

xAI models use canonical model IDs with the xai/ prefix, such as xai/grok-4.3. Text models use /v1/responses; image, video, and voice models use their dedicated gateway endpoints.

LLMTR sends store:false for xAI text models and rejects requests that set store:true, pass previous_response_id, or use xAI server-side tools. xAI may retain API requests and responses for 30 days for audit and security purposes; LLMTR does not store prompt or response content in its own database. When xAI returns usage.cost_in_usd_ticks for a text, image, or video request, LLMTR settles from that exact provider-reported cost.

Text

Public text models:

Model	Endpoint	Notes
`xai/grok-4.5`	`/v1/responses`	Text+image input, function calling, structured outputs, and `low`/`medium`/`high` reasoning. Default effort is `high`; inputs above 200K tokens are rejected until high-context pricing is verified.
`xai/grok-4.3`	`/v1/responses`	Inputs estimated above 200K tokens are rejected until xAI high-context pricing is verified.
`xai/grok-4.20-multi-agent`	`/v1/responses`	`reasoning.effort` controls agent count; `high`/`xhigh` may cost more.
`xai/grok-4.20-0309-reasoning`	`/v1/responses`	Long-context reasoning model.
`xai/grok-4.20-0309-non-reasoning`	`/v1/responses`	Text work that does not need reasoning.

curl "$LLMTR_BASE_URL/v1/responses" \
  -H "Authorization: Bearer llmtr-your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "xai/grok-4.3",
    "input": "Simplify a TypeScript service function."
  }'

For Grok 4.5, only the :low, :medium, and :high suffixes are supported. Without a suffix, LLMTR makes the provider default explicit and sends high. xAI does not support stop, presence_penalty, or frequency_penalty on this reasoning model, so the gateway rejects requests that include any of those fields with HTTP 400.

curl "$LLMTR_BASE_URL/v1/responses" \
  -H "Authorization: Bearer llmtr-your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "xai/grok-4.5:medium",
    "input": [
      {
        "role": "user",
        "content": [
          { "type": "input_text", "text": "Summarize the error flow in this screenshot." },
          { "type": "input_image", "image_url": "https://example.com/screenshot.png" }
        ]
      }
    ]
  }'
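
The Grok 4.5 suffix and field rules above can be checked on the client before a request is sent. The following is a minimal sketch, not LLMTR's gateway code: the function and constant names are hypothetical, and the only behavior it encodes is what this page states (allowed suffixes, the high default, and the three rejected fields).

```python
# Hypothetical client-side pre-check; names are illustrative assumptions.
GROK_45 = "xai/grok-4.5"
ALLOWED_EFFORTS = ("low", "medium", "high")
DEFAULT_EFFORT = "high"
UNSUPPORTED_FIELDS = ("stop", "presence_penalty", "frequency_penalty")


def resolve_grok45_effort(model: str) -> str:
    """Return the effort implied by a Grok 4.5 model string, or raise ValueError."""
    base, sep, suffix = model.partition(":")
    if base != GROK_45:
        raise ValueError(f"not a Grok 4.5 model ID: {model!r}")
    if not sep:
        # No suffix: the documented default effort is high.
        return DEFAULT_EFFORT
    if suffix not in ALLOWED_EFFORTS:
        raise ValueError(f"unsupported effort suffix: {suffix!r}")
    return suffix


def find_unsupported_fields(body: dict) -> list:
    """List the body fields that this page says are rejected with HTTP 400."""
    return [name for name in UNSUPPORTED_FIELDS if name in body]
```

Calling resolve_grok45_effort("xai/grok-4.5") yields "high", while a body containing stop would be flagged by find_unsupported_fields before it reaches the gateway.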

For the multi-agent model, set the effort with a model suffix, as in the request below, or with the reasoning.effort body field:

curl "$LLMTR_BASE_URL/v1/responses" \
  -H "Authorization: Bearer llmtr-your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "xai/grok-4.20-multi-agent:high",
    "input": "Evaluate this architecture decision across cost, risk, and maintenance."
  }'
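
The body-field variant of the multi-agent request might look like the sketch below. It assumes that reasoning.effort is sent as a nested "reasoning": {"effort": ...} object, which is the usual Responses API shape but is not spelled out on this page; build_multi_agent_request and the base URL in the usage note are hypothetical.

```python
import json
import urllib.request


def build_multi_agent_request(base_url: str, api_key: str, prompt: str,
                              effort: str = "high") -> urllib.request.Request:
    """Build (but do not send) a multi-agent request that sets effort in the body."""
    payload = {
        "model": "xai/grok-4.20-multi-agent",  # no suffix: effort comes from the body
        "input": prompt,
        "reasoning": {"effort": effort},  # assumed nesting for reasoning.effort
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/responses",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the returned object to urllib.request.urlopen, for example with base_url set to the value of LLMTR_BASE_URL.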

Image

xai/grok-imagine-image is cataloged at $0.02 per image, and xai/grok-imagine-image-quality at $0.04 per image. If xAI returns usage.cost_in_usd_ticks, that exact provider cost is used; otherwise LLMTR falls back to deterministic per-image pricing. For image edits, xAI bills both input images and generated output images, so fallback settlement uses input image count + output image count.

curl "$LLMTR_BASE_URL/v1/images/generations" \
  -H "Authorization: Bearer llmtr-your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "xai/grok-imagine-image",
    "prompt": "A clean product photo of a copper desk lamp",
    "aspect_ratio": "16:9",
    "resolution": "1k",
    "response_format": "url",
    "n": 1
  }'

Image edit:

curl "$LLMTR_BASE_URL/v1/images/edits" \
  -H "Authorization: Bearer llmtr-your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "xai/grok-imagine-image",
    "prompt": "Render this image as a pencil sketch.",
    "image": { "url": "https://example.com/source.jpg" },
    "aspect_ratio": "1:1",
    "response_format": "url"
  }'

For multi-image edits, send up to 5 input images in the images field.
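
The fallback settlement described in this section (used only when xAI does not return usage.cost_in_usd_ticks) can be sketched as a small calculation. This is an illustration under the assumption that input and output images are charged at the same cataloged per-image price; fallback_image_cost is a hypothetical helper, not part of the gateway.

```python
from decimal import Decimal

# Cataloged per-image prices from this page, in USD.
FALLBACK_PRICE_PER_IMAGE = {
    "xai/grok-imagine-image": Decimal("0.02"),
    "xai/grok-imagine-image-quality": Decimal("0.04"),
}


def fallback_image_cost(model: str, output_images: int, input_images: int = 0) -> Decimal:
    """Estimate the fallback cost: (input images + output images) x per-image price."""
    if output_images < 0 or input_images < 0:
        raise ValueError("image counts must be non-negative")
    # Generations have no input images; edits count inputs as well as outputs.
    return FALLBACK_PRICE_PER_IMAGE[model] * (input_images + output_images)
```

Under these assumptions, an edit with 3 input images and 1 output image on xai/grok-imagine-image counts 4 billable images, so the fallback estimate is $0.08.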

Video

xai/grok-imagine-video is priced at $0.05 per second. The gateway starts the async xAI video request and polls until the result is ready.

curl "$LLMTR_BASE_URL/v1/video/generations" \
  -H "Authorization: Bearer llmtr-your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "xai/grok-imagine-video",
    "prompt": "A calm time-lapse of clouds over a mountain ridge",
    "duration_seconds": 5,
    "aspect_ratio": "16:9",
    "resolution": "480p",
    "max_wait_seconds": 120
  }'
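
As a rough worked example of the per-second price, the sketch below estimates the cost of a video request from its requested duration. It assumes billing follows the requested duration_seconds; the actual settlement may differ if xAI reports an exact cost. estimate_video_cost is a hypothetical helper.

```python
from decimal import Decimal

# Cataloged price for xai/grok-imagine-video, in USD per second.
VIDEO_PRICE_PER_SECOND = Decimal("0.05")


def estimate_video_cost(duration_seconds) -> Decimal:
    """Estimate cost as duration x per-second price (assumes requested duration is billed)."""
    seconds = Decimal(str(duration_seconds))  # str() avoids binary float artifacts
    if seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return VIDEO_PRICE_PER_SECOND * seconds
```

For the 5-second request above, the estimate is 5 x $0.05 = $0.25.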

Voice

The TTS model is xai/grok-voice-tts; the REST STT model is xai/grok-voice-stt. Streaming STT and realtime voice are not public in this phase.

curl "$LLMTR_BASE_URL/v1/audio/speech" \
  -H "Authorization: Bearer llmtr-your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "xai/grok-voice-tts",
    "input": "Hello, this is a Grok Voice TTS example.",
    "voice": "eve",
    "language": "en"
  }' \
  --output voice.mp3

STT accepts base64-encoded audio in the JSON body. Billing is per hour of audio, so sending duration_seconds for short files reduces the preflight hold.

curl "$LLMTR_BASE_URL/v1/audio/transcriptions" \
  -H "Authorization: Bearer llmtr-your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "xai/grok-voice-stt",
    "audio_base64": "<base64 mp3>",
    "audio_format": "mp3",
    "language": "en",
    "format": true,
    "duration_seconds": 12
  }'
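
Building the STT body from raw audio bytes could look like the following sketch. The field names are copied from the request above; build_stt_body is a hypothetical helper, and leaving duration_seconds out when it is unknown is an assumption about how a client might behave rather than documented gateway behavior.

```python
import base64
from typing import Optional


def build_stt_body(audio: bytes, audio_format: str,
                   duration_seconds: Optional[float] = None,
                   language: str = "en") -> dict:
    """Return a JSON-serializable body for /v1/audio/transcriptions."""
    body = {
        "model": "xai/grok-voice-stt",
        "audio_base64": base64.b64encode(audio).decode("ascii"),
        "audio_format": audio_format,
        "language": language,
        "format": True,  # copied from the example request above
    }
    if duration_seconds is not None:
        # Supplying the real duration lets the preflight hold match short files.
        body["duration_seconds"] = duration_seconds
    return body
```

A caller would typically read the file with open(path, "rb").read() and serialize the result with json.dumps before posting it.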

Cost Safety

For text/image/video, LLMTR uses xAI cost_in_usd_ticks when it is returned.
xai/grok-4.5 is cataloged at official prices: input $2/1M tokens, cached input $0.50/1M tokens, and output $6/1M tokens; settlement still uses xAI's exact ticks when they are returned.
Image and video fall back to official deterministic prices only when provider ticks are absent.
TTS cost is calculated from the number of input characters; STT cost is calculated from the audio duration in the provider response.
xAI server-side tools, files/collections, streaming STT, and realtime voice are not public.
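
For budgeting, the cataloged xai/grok-4.5 prices above can be turned into a rough estimate. This is only a sketch: it assumes cached and uncached input tokens are counted separately, and the settled amount still comes from xAI's exact ticks when they are returned. estimate_grok45_cost is a hypothetical helper.

```python
from decimal import Decimal

PER_MILLION = Decimal(1_000_000)

# Cataloged xai/grok-4.5 prices from this page, in USD per 1M tokens.
GROK_45_PRICES = {
    "input": Decimal("2"),
    "cached_input": Decimal("0.50"),
    "output": Decimal("6"),
}


def estimate_grok45_cost(uncached_input_tokens: int, cached_input_tokens: int,
                         output_tokens: int) -> Decimal:
    """Estimate USD cost; assumes cached tokens are not also counted as uncached input."""
    total = (
        uncached_input_tokens * GROK_45_PRICES["input"]
        + cached_input_tokens * GROK_45_PRICES["cached_input"]
        + output_tokens * GROK_45_PRICES["output"]
    )
    return total / PER_MILLION
```

With 10,000 uncached input tokens, 2,000 cached input tokens, and 1,000 output tokens, the estimate is $0.02 + $0.001 + $0.006 = $0.027.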