# Reasoning Effort

For reasoning-capable models (e.g. the GPT-5 Codex family) the gateway lets you set the reasoning effort level in two ways:

  1. Model slug suffix: `openai/gpt-5.1-codex:high`
  2. Body field: `{ "reasoning": { "effort": "high" } }`

If both are provided, the body field wins.

| Level     | Suffix alias       | Description                                 |
| --------- | ------------------ | ------------------------------------------- |
| `minimal` | `:min`             | Reasoning almost disabled, fastest response |
| `low`     | `:low`             | Low effort, fast                            |
| `medium`  | `:medium`, `:med`  | Default, balanced                           |
| `high`    | `:high`            | High effort, deeper analysis                |
| `xhigh`   | `:max`, `:xhigh`   | Highest (on supported models only)          |

Supported levels vary by model. Sending an unsupported level returns `400 unsupported_capability`.

Unknown suffixes (`:turbo`, `:fast`, etc.) return `400 invalid_request_error`.
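Both error cases can be exercised with nothing but the standard library. A hedged sketch, assuming the gateway returns a JSON error body on `400` (the exact error payload shape is not documented on this page):

```python
import json
import urllib.error
import urllib.request

def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a /v1/responses request for the given model slug."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        "https://llmtr.com/v1/responses",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": "Bearer sk_your_key",
            "Content-Type": "application/json",
        },
    )

def send(req: urllib.request.Request) -> dict:
    """Return the response body; on HTTP errors, return the error body.

    A 400 here means either unsupported_capability (level not supported by
    this model) or invalid_request_error (unknown suffix such as :turbo).
    """
    try:
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as e:
        return json.load(e)
```
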

```sh
curl https://llmtr.com/v1/responses \
  -H "Authorization: Bearer sk_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-5.3-codex:max",
    "messages": [
      {"role": "user", "content": "Make this algorithm O(n)"}
    ]
  }'
```

`:max` is an alias for `xhigh`.

```sh
curl https://llmtr.com/v1/responses \
  -H "Authorization: Bearer sk_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-5.1-codex",
    "reasoning": { "effort": "low", "summary": "concise" },
    "messages": [
      {"role": "user", "content": "Quick one-line answer"}
    ]
  }'
```

`summary` is optional. Use it when you want a shorter or more detailed reasoning summary.

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://llmtr.com/v1",
    api_key="sk_your_key",
)

response = client.responses.create(
    model="openai/gpt-5.1-codex",
    input="Refactor: make this function pure",
    reasoning={"effort": "high"},
)
print(response.output_text)
```
```ts
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://llmtr.com/v1",
  apiKey: process.env.LLMTR_API_KEY,
});

const response = await client.responses.create({
  model: "openai/gpt-5.3-codex:max",
  input: "Explain the O(n) optimization",
});
console.log(response.output_text);
```

Higher reasoning levels usually produce more output tokens and longer execution times. That can increase both cost and latency.

For more detail see the Responses API page.