MiniMax / minimax/minimax-m2.7-highspeed

MiniMax M2.7 Highspeed - access through LLMTR

MiniMax M2.7 Highspeed is for teams that want M2.7-style behavior with lower response latency. It keeps the same 204,800-token context window and is suited to coding help, refactoring, live debugging, and interactive agent experiences where turnaround time matters most.

Technical specifications

Canonical ID     minimax/minimax-m2.7-highspeed
Provider         MiniMax
Context window   204,800 tokens
Operations       CHAT_COMPLETIONS
Modalities       text
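
A request's messages have to fit inside the 204,800-token window, so long conversation histories eventually need trimming. Below is a minimal Python sketch of one way to do that; count_tokens is a hypothetical stand-in for a real tokenizer (a rough characters-per-token heuristic), not MiniMax's actual tokenization.

# Sketch: drop the oldest messages so a conversation stays inside the
# 204,800-token window. count_tokens is a hypothetical placeholder; the
# ~4-characters-per-token heuristic is approximate, not exact.
CONTEXT_WINDOW = 204_800

def count_tokens(message: dict) -> int:
    return len(message["content"]) // 4 + 4  # +4 for role/format overhead

def trim_to_context(messages: list[dict], reserve_for_output: int = 4_096) -> list[dict]:
    budget = CONTEXT_WINDOW - reserve_for_output
    kept, used = [], 0
    for message in reversed(messages):  # walk from newest to oldest
        cost = count_tokens(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    return list(reversed(kept))  # restore chronological order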

Pricing

A 6% platform margin applies to credit top-ups; model usage prices are not separately marked up.

Operation         Metric       Unit           Price
CHAT_COMPLETIONS  CACHE_READ   PER_1M_TOKENS  $0.060000
CHAT_COMPLETIONS  CACHE_WRITE  PER_1M_TOKENS  $0.375000
CHAT_COMPLETIONS  INPUT_TEXT   PER_1M_TOKENS  $0.600000
CHAT_COMPLETIONS  OUTPUT_TEXT  PER_1M_TOKENS  $2.400000
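
As a quick cost check, the sketch below multiplies each metric's token count by its per-million-token rate from the table above. The 50,000-input / 2,000-output request in the example is illustrative, not a benchmark.

# Estimate request cost from the per-1M-token prices listed above.
PRICE_PER_1M = {
    "CACHE_READ": 0.06,
    "CACHE_WRITE": 0.375,
    "INPUT_TEXT": 0.60,
    "OUTPUT_TEXT": 2.40,
}

def estimate_cost(token_counts: dict) -> float:
    return sum(PRICE_PER_1M[metric] * n / 1_000_000
               for metric, n in token_counts.items())

# 50,000 uncached input tokens and 2,000 output tokens:
# 0.05 * $0.60 + 0.002 * $2.40 = $0.0300 + $0.0048 = $0.0348
print(f"${estimate_cost({'INPUT_TEXT': 50_000, 'OUTPUT_TEXT': 2_000}):.4f}")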

Example usage

If you already use the OpenAI SDK, you only need to change the base URL and the model identifier.

curl https://llmtr.com/v1/chat/completions \
  -H "Authorization: Bearer sk_your_key" \
  -H "Content-Type: application/json" \
  -d '{"model":"minimax/minimax-m2.7-highspeed","messages":[{"role":"user","content":"Hello"}]}'

Related models