Gemini 3.1 Flash-Lite - access through LLMTR

Built for simple extraction, fast agent tasks, translation, and high-volume multimodal requests where latency and cost efficiency matter. Use this stable GA model ID in place of the retired preview identifier.

Technical specifications

Canonical ID	`google/gemini-3.1-flash-lite`
Provider	Google
Context window	1,048,576 tokens
Operations	CHAT_COMPLETIONS, RESPONSES
Modalities	text, image, video, audio

Pricing

A 6% platform margin applies to credit top-ups; model usage prices are not separately marked up.

Operation	Metric	Unit	Price
CHAT_COMPLETIONS	CACHE_STORAGE	PER_1M_TOKEN_HOURS	$1.00
CHAT_COMPLETIONS	CACHE_WRITE	PER_1M_TOKENS	$0.025000
CHAT_COMPLETIONS	CACHE_WRITE	PER_1M_TOKENS	$0.025000
CHAT_COMPLETIONS	CACHE_WRITE	PER_1M_TOKENS	$0.025000
CHAT_COMPLETIONS	CACHE_WRITE	PER_1M_TOKENS	$0.050000
CHAT_COMPLETIONS	INPUT_AUDIO	PER_1M_TOKENS	$0.500000
CHAT_COMPLETIONS	INPUT_IMAGE	PER_1M_TOKENS	$0.250000
CHAT_COMPLETIONS	INPUT_TEXT	PER_1M_TOKENS	$0.250000
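
As a rough illustration of how the per-1M-token prices translate into a bill, the sketch below estimates input cost only, using the three INPUT_* rows listed above. The function name `estimate_input_cost` is hypothetical, and the sketch deliberately leaves out cache charges, any output pricing (not shown in this table), and the 6% top-up margin, since the page does not spell out how those combine.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the CHAT_COMPLETIONS INPUT_* rows above.
INPUT_PRICE_PER_1M = {
    "INPUT_TEXT": Decimal("0.25"),
    "INPUT_IMAGE": Decimal("0.25"),
    "INPUT_AUDIO": Decimal("0.50"),
}
TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_input_cost(tokens_by_metric: dict[str, int]) -> Decimal:
    """Return an estimated input cost in USD (hypothetical helper).

    Assumption: cost scales linearly as tokens / 1,000,000 * price.
    Cache, output, and top-up margin charges are not modeled.
    """
    total = Decimal(0)
    for metric, tokens in tokens_by_metric.items():
        if metric not in INPUT_PRICE_PER_1M:
            raise KeyError(f"no price listed for metric {metric!r}")
        if tokens < 0:
            raise ValueError("token counts must be non-negative")
        total += INPUT_PRICE_PER_1M[metric] * Decimal(tokens) / TOKENS_PER_UNIT
    return total


if __name__ == "__main__":
    # 2M text tokens plus 500k audio tokens.
    print(estimate_input_cost({"INPUT_TEXT": 2_000_000, "INPUT_AUDIO": 500_000}))
```

Under that linear assumption, a request with 120,000 text input tokens would come to about $0.03 of input cost. `Decimal` is used instead of floats so small per-request amounts add up without rounding drift.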

Example usage

If you already use the OpenAI SDK, change only the base URL and the model identifier.

curl https://llmtr.com/v1/chat/completions \
  -H "Authorization: Bearer sk_your_key" \
  -H "Content-Type: application/json" \
  -d '{"model":"google/gemini-3.1-flash-lite","messages":[{"role":"user","content":"Hello"}]}'
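
For readers who prefer Python, the same request can be sketched with the standard library alone. The URL, headers, and JSON body mirror the curl command above; the helper names `build_chat_request` and `send` are hypothetical, and the response is only assumed to be a JSON document. With the OpenAI SDK itself, the equivalent change would be the client's base URL and the model argument.

```python
import json
import urllib.request

BASE_URL = "https://llmtr.com/v1"  # taken from the curl example
MODEL_ID = "google/gemini-3.1-flash-lite"


def build_chat_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat completions request (hypothetical helper)."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and parse the body as JSON.

    Assumption: the endpoint returns a JSON object; its exact fields
    are not documented on this page, so none are accessed here.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    request = build_chat_request("sk_your_key", "Hello")
    print(request.full_url)
    # Uncomment with a real key to perform the network call:
    # print(send(request))
```

Keeping request construction separate from sending makes it possible to inspect or log the payload without spending credits.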

Related models

Gemini Deep Research Preview	`google/deep-research-pro-preview-12-2025`
Gemini 2.5 Flash	`google/gemini-2.5-flash`
Gemini 2.5 Flash Image	`google/gemini-2.5-flash-image`
Gemini 2.5 Flash-Lite	`google/gemini-2.5-flash-lite`
Gemini 2.5 Flash-Lite Preview	`google/gemini-2.5-flash-lite-preview-09-2025`
Gemini 2.5 Flash Native Audio (Live API)	`google/gemini-2.5-flash-native-audio-preview-12-2025`
Gemini 2.5 Flash Preview TTS	`google/gemini-2.5-flash-preview-tts`
Gemini 2.5 Pro	`google/gemini-2.5-pro`