# MiniMax M2.7 Highspeed - access through LLMTR
MiniMax M2.7 Highspeed is aimed at teams that want M2.7-style behavior with lower response latency. It suits coding assistance, refactoring, live debugging, and interactive agent experiences where turnaround time matters, while keeping the same 204,800-token context window as MiniMax M2.7.
## Technical specifications
| Field | Value |
|---|---|
| Canonical ID | `minimax/minimax-m2.7-highspeed` |
| Provider | MiniMax |
| Context window | 204,800 tokens |
| Operations | CHAT_COMPLETIONS |
| Modalities | text |
## Pricing
A 6% platform margin applies to credit top-ups; model usage prices are not separately marked up.
| Operation | Metric | Unit | Price |
|---|---|---|---|
| CHAT_COMPLETIONS | CACHE_READ | PER_1M_TOKENS | $0.06 |
| CHAT_COMPLETIONS | CACHE_WRITE | PER_1M_TOKENS | $0.375 |
| CHAT_COMPLETIONS | INPUT_TEXT | PER_1M_TOKENS | $0.60 |
| CHAT_COMPLETIONS | OUTPUT_TEXT | PER_1M_TOKENS | $2.40 |
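
As a worked example of the rates above, here is a minimal sketch that prices a single request from its token counts. The token counts are hypothetical; the per-1M prices come from the table.

```python
# Per-1M-token prices from the table above (USD).
PRICES = {
    "CACHE_READ": 0.06,
    "CACHE_WRITE": 0.375,
    "INPUT_TEXT": 0.60,
    "OUTPUT_TEXT": 2.40,
}

def request_cost(usage: dict) -> float:
    """Cost in USD for one request, given token counts keyed by metric."""
    return sum(PRICES[metric] * tokens / 1_000_000 for metric, tokens in usage.items())

# Hypothetical request: 100k uncached input tokens, 5k output tokens.
cost = request_cost({"INPUT_TEXT": 100_000, "OUTPUT_TEXT": 5_000})
print(f"${cost:.4f}")  # 100,000 * $0.60/1M + 5,000 * $2.40/1M = $0.0720
```

Cache reads are a tenth the price of regular input, so prompts with a large stable prefix (system prompt, tool definitions) get markedly cheaper once cached.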
## Example usage
With existing OpenAI SDK flows, change only the base URL and model identifier.
```bash
curl https://llmtr.com/v1/chat/completions \
  -H "Authorization: Bearer sk_your_key" \
  -H "Content-Type: application/json" \
  -d '{"model":"minimax/minimax-m2.7-highspeed","messages":[{"role":"user","content":"Hello"}]}'
```
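
The same request from the OpenAI Python SDK is a sketch along these lines, assuming the `openai` package is installed; only `base_url` and the model identifier differ from a stock OpenAI setup.

```python
from openai import OpenAI

# Point the standard OpenAI client at the LLMTR endpoint.
client = OpenAI(
    base_url="https://llmtr.com/v1",
    api_key="sk_your_key",  # your LLMTR API key
)

response = client.chat.completions.create(
    model="minimax/minimax-m2.7-highspeed",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
```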
## Related models
- MiniMax M2.7 (`minimax/minimax-m2.7`)
- MiniMax M2.5 (`minimax/minimax-m2.5`)
- MiniMax M2.5 Highspeed (`minimax/minimax-m2.5-highspeed`)
- Gemma 4 (`llmtr/gemma-4`)
- Qwen 3.6 35B-A3B (`llmtr/qwen3-6-35b`)
- MedGemma 4B (`llmtr/medgemma-4b`)
- Sincap (`llmtr/sincap`)
- GPT-5.5 (`openai/gpt-5.5`)