xAI Grok Models
xAI models use canonical model IDs under xai/.... Text models use /v1/responses; image, video, and voice models use their dedicated gateway endpoints.
LLMTR sends store:false for xAI text models and rejects store:true, previous_response_id, and xAI server-side tools. When xAI returns usage.cost_in_usd_ticks for text/image/video, LLMTR settles from that exact provider-reported cost.
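
The rejection rules above can be checked on the client before a request is sent. The following is a minimal sketch, not LLMTR's own validation: `check_xai_text_request` is a hypothetical helper, and it only covers `store` and `previous_response_id`, because the identifiers of xAI server-side tools are not listed on this page.

```python
def check_xai_text_request(body: dict) -> list[str]:
    """Return the reasons a text request body would be rejected (empty if none)."""
    problems = []
    # store:true is rejected; omitting the field or sending false is accepted.
    if body.get("store") is True:
        problems.append("store must not be true")
    # Chaining onto a stored response is not supported for xAI text models.
    if "previous_response_id" in body:
        problems.append("previous_response_id is not supported")
    return problems


print(check_xai_text_request({"model": "xai/grok-4.3", "input": "hi"}))
print(check_xai_text_request({"model": "xai/grok-4.3", "input": "hi", "store": True}))
```

An empty list means neither of the two checked fields is present in a rejected form.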
Public text models:
| Model | Endpoint | Notes |
|---|---|---|
| xai/grok-4.3 | /v1/responses | Inputs estimated above 200K tokens are rejected until xAI high-context pricing is verified. |
| xai/grok-4.20-multi-agent | /v1/responses | reasoning.effort controls agent count; high/xhigh may cost more. |
| xai/grok-4.20-0309-reasoning | /v1/responses | Long-context reasoning model. |
| xai/grok-4.20-0309-non-reasoning | /v1/responses | Text work that does not need reasoning. |
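
The table notes that reasoning.effort controls agent count for xai/grok-4.20-multi-agent. As a rough sketch of the body-field form, the helper below builds a request body. It assumes the dotted name maps to a nested `reasoning` object with an `effort` key; `multi_agent_body` is a hypothetical name, and the effort value is passed through without validation because the full list of accepted values is not given here.

```python
import json


def multi_agent_body(prompt: str, effort: str) -> dict:
    """Build a /v1/responses body that sets the effort as a body field."""
    return {
        "model": "xai/grok-4.20-multi-agent",
        "input": prompt,
        # Assumption: reasoning.effort is sent as a nested object.
        "reasoning": {"effort": effort},
    }


body = multi_agent_body("Compare two caching strategies.", "high")
print(json.dumps(body, indent=2))
```

The printed JSON can be passed to curl with `-d` in place of an inline body.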

    curl "$LLMTR_BASE_URL/v1/responses" \
      -H "Authorization: Bearer llmtr-your_key" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "xai/grok-4.3",
        "input": "Simplify a TypeScript service function."
      }'

For the multi-agent model, use a suffix or body field:

    curl "$LLMTR_BASE_URL/v1/responses" \
      -H "Authorization: Bearer llmtr-your_key" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "xai/grok-4.20-multi-agent:high",
        "input": "Evaluate this architecture decision across cost, risk, and maintenance."
      }'

xai/grok-imagine-image is cataloged at $0.02 per image, and xai/grok-imagine-image-quality at $0.04 per image. If xAI returns usage.cost_in_usd_ticks, that exact provider cost is used; otherwise LLMTR falls back to deterministic per-image pricing. For image edits, xAI bills both input images and generated output images, so fallback settlement uses input image count + output image count.
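
The fallback arithmetic described above can be written out as a small function. This sketch only illustrates the per-image rule and is not LLMTR's settlement code; `fallback_image_cost` is a hypothetical name, and the prices are the catalog figures quoted in this section.

```python
from decimal import Decimal

# Catalog prices quoted above, in USD per image.
PRICE_PER_IMAGE = {
    "xai/grok-imagine-image": Decimal("0.02"),
    "xai/grok-imagine-image-quality": Decimal("0.04"),
}


def fallback_image_cost(model: str, output_images: int, input_images: int = 0) -> Decimal:
    """Fallback cost in USD when no provider-reported cost is available.

    Generations pass input_images=0; edits count inputs and outputs together.
    """
    billable_images = input_images + output_images
    return PRICE_PER_IMAGE[model] * billable_images


print(fallback_image_cost("xai/grok-imagine-image", output_images=1))
print(fallback_image_cost("xai/grok-imagine-image-quality", output_images=1, input_images=2))
```

The first call is a single generation. The second is an edit with two input images and one output, so three images are billed.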

    curl "$LLMTR_BASE_URL/v1/images/generations" \
      -H "Authorization: Bearer llmtr-your_key" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "xai/grok-imagine-image",
        "prompt": "A clean product photo of a copper desk lamp",
        "aspect_ratio": "16:9",
        "resolution": "1k",
        "response_format": "url",
        "n": 1
      }'

Image edit:

    curl "$LLMTR_BASE_URL/v1/images/edits" \
      -H "Authorization: Bearer llmtr-your_key" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "xai/grok-imagine-image",
        "prompt": "Render this image as a pencil sketch.",
        "image": { "url": "https://example.com/source.jpg" },
        "aspect_ratio": "1:1",
        "response_format": "url"
      }'

For multi-image edits, send up to 5 inputs in images.
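
A multi-image edit body might be assembled as in the sketch below. It assumes each entry in `images` has the same `{"url": ...}` shape as the single `image` field in the edit example, which this page does not confirm. `multi_image_edit_body` is a hypothetical helper.

```python
MAX_EDIT_INPUTS = 5  # limit stated above


def multi_image_edit_body(prompt: str, urls: list[str],
                          model: str = "xai/grok-imagine-image") -> dict:
    """Build an /v1/images/edits body with several input images."""
    if not 1 <= len(urls) <= MAX_EDIT_INPUTS:
        raise ValueError(f"expected 1 to {MAX_EDIT_INPUTS} input images, got {len(urls)}")
    return {
        "model": model,
        "prompt": prompt,
        # Assumption: entries mirror the single-image {"url": ...} object.
        "images": [{"url": u} for u in urls],
        "response_format": "url",
    }


body = multi_image_edit_body(
    "Blend these two photos into one collage.",
    ["https://example.com/a.jpg", "https://example.com/b.jpg"],
)
print(len(body["images"]))
```

Checking the input count on the client avoids sending a request that exceeds the five-image limit.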
xai/grok-imagine-video is priced at $0.05 per second. The gateway starts the async xAI video request and polls until the result is ready.
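
As a worked example of the per-second price, the sketch below computes the catalog cost for a requested duration. `video_catalog_cost` is a hypothetical helper. The figure it returns is only the catalog estimate, since a provider-reported cost takes precedence when xAI returns one.

```python
from decimal import Decimal

VIDEO_USD_PER_SECOND = Decimal("0.05")  # catalog price quoted above


def video_catalog_cost(duration_seconds: int) -> Decimal:
    """Catalog cost in USD for a video of the given length."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return VIDEO_USD_PER_SECOND * duration_seconds


print(video_catalog_cost(5))  # 0.25
```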

    curl "$LLMTR_BASE_URL/v1/video/generations" \
      -H "Authorization: Bearer llmtr-your_key" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "xai/grok-imagine-video",
        "prompt": "A calm time-lapse of clouds over a mountain ridge",
        "duration_seconds": 5,
        "aspect_ratio": "16:9",
        "resolution": "480p",
        "max_wait_seconds": 120
      }'

The TTS model is xai/grok-voice-tts; the REST STT model is xai/grok-voice-stt. Streaming STT and realtime voice are not public in this phase.

    curl "$LLMTR_BASE_URL/v1/audio/speech" \
      -H "Authorization: Bearer llmtr-your_key" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "xai/grok-voice-tts",
        "input": "Hello, this is a Grok Voice TTS example.",
        "voice": "eve",
        "language": "en"
      }' \
      --output voice.mp3

STT accepts base64 audio in the JSON body. Because billing is per hour of audio, sending duration_seconds for short files reduces the preflight hold.
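
Preparing the base64 body in a script could look like the sketch below. The field names are taken from the STT request example on this page. `stt_body` is a hypothetical helper, and the duration is supplied by the caller rather than measured from the audio.

```python
import base64
from typing import Optional


def stt_body(audio: bytes, audio_format: str,
             duration_seconds: Optional[int] = None,
             language: str = "en") -> dict:
    """Build a /v1/audio/transcriptions body from raw audio bytes."""
    body = {
        "model": "xai/grok-voice-stt",
        "audio_base64": base64.b64encode(audio).decode("ascii"),
        "audio_format": audio_format,
        "language": language,
    }
    # Only send the duration hint when the caller actually knows it.
    if duration_seconds is not None:
        body["duration_seconds"] = duration_seconds
    return body


# Placeholder bytes stand in for the contents of a real mp3 file.
body = stt_body(b"abc", "mp3", duration_seconds=12)
print(body["audio_base64"])  # YWJj
```

In practice the bytes would come from reading the audio file in binary mode.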

    curl "$LLMTR_BASE_URL/v1/audio/transcriptions" \
      -H "Authorization: Bearer llmtr-your_key" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "xai/grok-voice-stt",
        "audio_base64": "<base64 mp3>",
        "audio_format": "mp3",
        "language": "en",
        "format": true,
        "duration_seconds": 12
      }'

Cost Safety
- For text/image/video, LLMTR uses xAI cost_in_usd_ticks when it is returned.
- Image and video fall back to official deterministic prices only when provider ticks are absent.
- TTS is calculated from input characters; STT is calculated from provider response duration.
- xAI server-side tools, files/collections, streaming STT, and realtime voice are not public.
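
The rules in this list can be read as a small decision function. The sketch below is an illustration, not LLMTR's implementation: the modality labels and return strings are hypothetical, and the text case without provider ticks returns "unspecified" because this page does not describe it.

```python
def settlement_basis(modality: str, usage: dict) -> str:
    """Name the basis used to settle a request, following the list above."""
    has_ticks = usage.get("cost_in_usd_ticks") is not None
    # Provider-reported cost wins for text, image, and video.
    if modality in ("text", "image", "video") and has_ticks:
        return "provider_ticks"
    # Image and video fall back to catalog prices only without ticks.
    if modality in ("image", "video"):
        return "catalog_price"
    if modality == "tts":
        return "input_characters"
    if modality == "stt":
        return "response_duration"
    return "unspecified"  # not covered by the rules listed above


print(settlement_basis("image", {"cost_in_usd_ticks": 123}))
print(settlement_basis("video", {}))
```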