# Reasoning Effort
For reasoning-capable models (e.g. the GPT-5 Codex family), the gateway lets you set the reasoning effort level in two ways:

- **Model slug suffix** — `openai/gpt-5.1-codex:high`
- **Body field** — `{ "reasoning": { "effort": "high" } }`

If both are provided, the body field wins.
## Supported levels

| Level | Suffix alias | Description |
|---|---|---|
| `minimal` | `:min` | Reasoning almost disabled, fastest response |
| `low` | `:low` | Low effort, fast |
| `medium` | `:medium`, `:med` | Default, balanced |
| `high` | `:high` | High effort, deeper analysis |
| `xhigh` | `:max`, `:xhigh` | Highest effort (supported models only) |
Supported levels vary by model. Sending an unsupported level returns `400 unsupported_capability`.

Unknown suffixes (`:turbo`, `:fast`, etc.) return `400 invalid_request_error`.
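The suffix rules above can be sketched as a small client-side validator. This is a hypothetical helper for illustration only — the gateway performs the real validation, and the set of supported levels passed in here stands in for a per-model capability list:

```python
# Client-side sketch of the documented suffix rules: known aliases map
# to effort levels, unknown suffixes are rejected, and levels a model
# does not support are rejected separately.
SUFFIX_ALIASES = {
    "min": "minimal", "low": "low", "medium": "medium", "med": "medium",
    "high": "high", "xhigh": "xhigh", "max": "xhigh",
}

def validate_suffix(model_slug: str, supported_levels: set[str]) -> str:
    """Map a slug suffix to an effort level, mirroring the documented errors."""
    if ":" not in model_slug:
        return "medium"  # no suffix: the documented default
    _, _, suffix = model_slug.partition(":")
    level = SUFFIX_ALIASES.get(suffix)
    if level is None:
        # Unknown suffix (:turbo, :fast, ...) → 400 invalid_request_error
        raise ValueError("invalid_request_error")
    if level not in supported_levels:
        # Valid level the model lacks → 400 unsupported_capability
        raise ValueError("unsupported_capability")
    return level
```

For example, `validate_suffix("openai/gpt-5.1-codex:xhigh", {"low", "medium", "high"})` raises `unsupported_capability`, while `:turbo` raises `invalid_request_error` regardless of the model.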
## Slug suffix example

```sh
curl https://llmtr.com/v1/responses \
  -H "Authorization: Bearer sk_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-5.3-codex:max",
    "messages": [
      {"role": "user", "content": "Make this algorithm O(n)"}
    ]
  }'
```

`:max` is an alias for `xhigh`.
## Body field example

```sh
curl https://llmtr.com/v1/responses \
  -H "Authorization: Bearer sk_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-5.1-codex",
    "reasoning": { "effort": "low", "summary": "concise" },
    "messages": [
      {"role": "user", "content": "Quick one-line answer"}
    ]
  }'
```

`summary` is optional. Use it when you want a shorter or more detailed reasoning summary.
## Python (OpenAI SDK)

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://llmtr.com/v1",
    api_key="sk_your_key",
)

response = client.responses.create(
    model="openai/gpt-5.1-codex",
    input="Refactor: make this function pure",
    reasoning={"effort": "high"},
)

print(response.output_text)
```
## JavaScript (OpenAI SDK)

```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://llmtr.com/v1",
  apiKey: process.env.LLMTR_API_KEY,
});

const response = await client.responses.create({
  model: "openai/gpt-5.3-codex:max",
  input: "Explain the O(n) optimization",
});

console.log(response.output_text);
```
## Billing impact

Higher reasoning levels usually produce more output tokens and longer execution times, which can increase both cost and latency.
For more detail see the Responses API page.