Gemini 3.1 Flash-Lite - access through LLMTR

Built for simple extraction, fast agent tasks, translation, and high-volume multimodal requests where latency and cost efficiency matter. Use this stable GA model ID instead of the retired preview identifier.

Technical specifications

Canonical ID: google/gemini-3.1-flash-lite
Provider: Google
Context window: 1,048,576 tokens
Operations: CHAT_COMPLETIONS, RESPONSES
Modalities: text, image, video, audio

Pricing

A 6% platform margin applies to credit top-ups; model usage prices are not separately marked up.

Operation         Metric         Unit                Price
CHAT_COMPLETIONS  CACHE_STORAGE  PER_1M_TOKEN_HOURS  $1.00
CHAT_COMPLETIONS  CACHE_WRITE    PER_1M_TOKENS       $0.025000
CHAT_COMPLETIONS  CACHE_WRITE    PER_1M_TOKENS       $0.025000
CHAT_COMPLETIONS  CACHE_WRITE    PER_1M_TOKENS       $0.025000
CHAT_COMPLETIONS  CACHE_WRITE    PER_1M_TOKENS       $0.050000
CHAT_COMPLETIONS  INPUT_AUDIO    PER_1M_TOKENS       $0.500000
CHAT_COMPLETIONS  INPUT_IMAGE    PER_1M_TOKENS       $0.250000
CHAT_COMPLETIONS  INPUT_TEXT     PER_1M_TOKENS       $0.250000
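
As a rough illustration of how the per-million-token input rates above turn into a request cost, here is a minimal Python sketch. The function name and the modality keys are hypothetical, chosen for this example only. It covers just the three input rates listed in this section; output, cache, and video rates are left out, and the 6% top-up margin is not modeled, so treat the result as a partial estimate.

```python
from decimal import Decimal

# USD per 1M input tokens, copied from the CHAT_COMPLETIONS rows above.
# The dictionary keys are illustrative labels, not API field names.
INPUT_RATES_PER_1M = {
    "text": Decimal("0.25"),
    "image": Decimal("0.25"),
    "audio": Decimal("0.50"),
}
TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_input_cost(token_counts):
    """Return the estimated input cost in USD for a {modality: tokens} mapping."""
    total = Decimal("0")
    for modality, tokens in token_counts.items():
        if modality not in INPUT_RATES_PER_1M:
            raise ValueError(f"no listed input rate for modality: {modality}")
        total += INPUT_RATES_PER_1M[modality] * Decimal(tokens) / TOKENS_PER_UNIT
    return total


# 200,000 text tokens plus 50,000 audio tokens:
# 0.25 * 0.2 + 0.50 * 0.05 = 0.075 USD
example_cost = estimate_input_cost({"text": 200_000, "audio": 50_000})
print(f"${example_cost:.6f}")  # prints $0.075000
```

Decimal is used instead of float so that small per-token amounts add up without binary rounding noise.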

Example usage

With existing OpenAI SDK flows, change only the base URL and model identifier.

curl https://llmtr.com/v1/chat/completions \
  -H "Authorization: Bearer sk_your_key" \
  -H "Content-Type: application/json" \
  -d '{"model":"google/gemini-3.1-flash-lite","messages":[{"role":"user","content":"Hello"}]}'
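
The same request can be sketched in Python using only the standard library. This mirrors the curl call above (same URL, headers, and JSON body); the helper names build_chat_request and send are hypothetical, and the shape of the JSON response is not documented in this section, so the sketch returns the parsed body without assuming any fields.

```python
import json
import urllib.request

BASE_URL = "https://llmtr.com/v1"           # base URL taken from the curl example
MODEL_ID = "google/gemini-3.1-flash-lite"   # canonical model ID


def build_chat_request(api_key, prompt):
    """Build (but do not send) a POST request matching the curl example."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30.0):
    """Send the request and return the parsed JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Usage (requires a real key and network access):
#   reply = send(build_chat_request("sk_your_key", "Hello"))
#   print(reply)
```

Keeping request construction separate from sending makes it possible to inspect the URL, headers, and body before any network call is made.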

Related models