
Nemotron 3 Ultra 550B A55B - access through LLMTR

Nemotron 3 Ultra is NVIDIA's largest open-weight model (550B total, 55B active parameters). With a 1M-token context window it suits long-document analysis, multi-step reasoning, function calling, and agent workflows that need JSON output. It is offered free with a daily usage quota; once the quota is reached, requests are limited until the next day.

Technical specifications

Canonical ID: nvidia/nemotron-3-ultra-550b-a55b
Provider: NVIDIA
Context window: 1,000,000 tokens
Operations: CHAT_COMPLETIONS
Modalities: text

Pricing

A 6% platform margin applies to credit top-ups; model usage prices are not separately marked up.

Operation         | Metric       | Unit           | Price
CHAT_COMPLETIONS  | INPUT_TEXT   | PER_1M_TOKENS  | Not available
CHAT_COMPLETIONS  | OUTPUT_TEXT  | PER_1M_TOKENS  | Not available

Example usage

If you already use the OpenAI SDK, only the base URL and the model identifier need to change.

curl https://llmtr.com/v1/chat/completions \
  -H "Authorization: Bearer llmtr-your_key" \
  -H "Content-Type: application/json" \
  -d '{"model":"nvidia/nemotron-3-ultra-550b-a55b","messages":[{"role":"user","content":"Hello"}]}'
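The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the endpoint, headers, and model identifier are taken from the curl command above, while the function names build_chat_request and send_chat are hypothetical, and the response parsing assumes the OpenAI-style chat completions shape (choices[0].message.content), which this page does not show.

```python
import json
import urllib.request

BASE_URL = "https://llmtr.com/v1"  # taken from the curl example
MODEL_ID = "nvidia/nemotron-3-ultra-550b-a55b"


def build_chat_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to the chat completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat(prompt: str, api_key: str) -> str:
    """Send the request and return the text of the first choice.

    Assumption: the response body follows the OpenAI chat completions
    shape (choices[0].message.content). The page does not document it.
    """
    request = build_chat_request(prompt, api_key)
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Calling send_chat("Hello", "llmtr-your_key") would issue the same request as the curl command. Keeping the request construction in its own function makes it possible to inspect the URL, headers, and body without touching the network or the daily quota.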
