Nemotron 3 Ultra 550B A55B - access through LLMTR
Nemotron 3 Ultra is NVIDIA's largest open-weight model (550B total, 55B active parameters). With a 1M-token context window it suits long-document analysis, multi-step reasoning, function calling, and agent workflows that need JSON output. It is offered free with a daily usage quota; once the quota is reached, requests are limited until the next day.
Technical specifications
| Field | Value |
|---|---|
| Canonical ID | nvidia/nemotron-3-ultra-550b-a55b |
| Provider | NVIDIA |
| Context window | 1,000,000 tokens |
| Operations | CHAT_COMPLETIONS |
| Modalities | text |
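To illustrate what the 1,000,000-token context window means in practice, here is a minimal sketch of a pre-flight size check. The 4-characters-per-token ratio is a rough heuristic rather than the model's actual tokenizer, the helper names and the `reserved_for_output` default are hypothetical, and the sketch assumes the window covers both the prompt and the reply.

```python
CONTEXT_WINDOW_TOKENS = 1_000_000  # from the specification table
CHARS_PER_TOKEN = 4  # assumption: rough heuristic, not the real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate the token count from character length (ceiling division)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(text: str, reserved_for_output: int = 4_096) -> bool:
    """Return True if the estimated prompt size leaves room for the reply.

    reserved_for_output is a hypothetical budget chosen for this sketch.
    """
    return estimate_tokens(text) + reserved_for_output <= CONTEXT_WINDOW_TOKENS
```

A check like this only gives an estimate; for documents near the limit, an exact count from the model's tokenizer would be needed.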
Pricing
A 6% platform margin applies to credit top-ups; model usage prices are not separately marked up.
| Operation | Metric | Unit | Price |
|---|---|---|---|
| CHAT_COMPLETIONS | INPUT_TEXT | PER_1M_TOKENS | Not available |
| CHAT_COMPLETIONS | OUTPUT_TEXT | PER_1M_TOKENS | Not available |
Example usage
With existing OpenAI SDK flows, change only the base URL and model identifier.
curl https://llmtr.com/v1/chat/completions \
  -H "Authorization: Bearer llmtr-your_key" \
  -H "Content-Type: application/json" \
  -d '{"model":"nvidia/nemotron-3-ultra-550b-a55b","messages":[{"role":"user","content":"Hello"}]}'
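The same request can be sketched in Python using only the standard library. The URL, headers, and payload mirror the curl call above; the function names `build_chat_request` and `send_chat` are illustrative, and the response parsing assumes an OpenAI-style chat-completions body, which this page implies but does not spell out.

```python
import json
import urllib.request

BASE_URL = "https://llmtr.com/v1"  # base URL taken from the curl example
MODEL_ID = "nvidia/nemotron-3-ultra-550b-a55b"


def build_chat_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request matching the curl example."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat(prompt: str, api_key: str) -> str:
    """Send the request and return the first reply's text.

    Assumes an OpenAI-style response shape; adjust if the body differs.
    """
    request = build_chat_request(prompt, api_key)
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Calling `send_chat("Hello", "llmtr-your_key")` with a valid key would perform the network request; `build_chat_request` alone can be used to inspect the outgoing URL, headers, and JSON payload without sending anything.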
Related models
- Nemotron 3 Super 120B A12B nvidia/nemotron-3-super-120b-a12b
- Nemotron 3 Nano 30B A3B nvidia/nemotron-3-nano-30b-a3b
- Nemotron Nano 12B V2 VL nvidia/nemotron-nano-12b-v2-vl
- Gemma 4 llmtr/gemma-4
- Qwen 3.6 35B-A3B llmtr/qwen3-6-35b
- MedGemma 4B llmtr/medgemma-4b
- Trendyol 7B llmtr/trendyol-7b
- Magibu 11B v8 llmtr/magibu-11b-v8