Gemini 3.1 Flash-Lite - access through LLMTR
Built for simple extraction, fast agentic tasks, translation, and high-frequency multimodal requests where latency and cost efficiency matter. Use this stable GA model ID instead of the retired preview identifier.
Technical specifications
| Field | Value |
|---|---|
| Canonical ID | `google/gemini-3.1-flash-lite` |
| Provider | Google |
| Context window | 1,048,576 tokens |
| Operations | CHAT_COMPLETIONS, RESPONSES |
| Modalities | text, image, video, audio |
Pricing
A 6% platform margin applies to credit top-ups; model usage prices are not separately marked up.
| Operation | Metric | Unit | Price |
|---|---|---|---|
| CHAT_COMPLETIONS | CACHE_STORAGE | PER_1M_TOKEN_HOURS | $1.00 |
| CHAT_COMPLETIONS | CACHE_WRITE | PER_1M_TOKENS | $0.025000 |
| CHAT_COMPLETIONS | CACHE_WRITE | PER_1M_TOKENS | $0.050000 |
| CHAT_COMPLETIONS | INPUT_AUDIO | PER_1M_TOKENS | $0.500000 |
| CHAT_COMPLETIONS | INPUT_IMAGE | PER_1M_TOKENS | $0.250000 |
| CHAT_COMPLETIONS | INPUT_TEXT | PER_1M_TOKENS | $0.250000 |
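As a sketch of how these rates combine, the snippet below estimates the usage cost of a text-only request and the credit top-up needed to cover it. It assumes prices apply per million tokens and that the 6% platform margin is added on top of a credit purchase; the function names are illustrative, not part of the LLMTR API.

```python
# Illustrative cost estimate for google/gemini-3.1-flash-lite (text input).
# Rates are taken from the pricing table above; helper names are hypothetical.

INPUT_TEXT_PER_1M = 0.25    # USD per 1M input text tokens
CACHE_WRITE_PER_1M = 0.025  # USD per 1M cache-write tokens

def usage_cost(input_text_tokens: int, cache_write_tokens: int = 0) -> float:
    """Model usage cost in USD; usage itself carries no markup."""
    return (input_text_tokens * INPUT_TEXT_PER_1M
            + cache_write_tokens * CACHE_WRITE_PER_1M) / 1_000_000

def credits_to_buy(usage_usd: float, margin: float = 0.06) -> float:
    """Top-up amount in USD, assuming the 6% margin is added at purchase time."""
    return usage_usd * (1 + margin)

cost = usage_cost(1_000_000)           # 1M fresh input text tokens
print(round(cost, 4))                  # 0.25
print(round(credits_to_buy(cost), 4))  # 0.265
```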
Example usage
With existing OpenAI SDK flows, change only the base URL and model identifier.
```shell
curl https://llmtr.com/v1/chat/completions \
  -H "Authorization: Bearer sk_your_key" \
  -H "Content-Type: application/json" \
  -d '{"model":"google/gemini-3.1-flash-lite","messages":[{"role":"user","content":"Hello"}]}'
```
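The same call can be issued from Python with only the standard library. This sketch builds (but does not send) the request shown in the curl example above; the endpoint and header shapes mirror that example rather than any documented LLMTR SDK.

```python
import json
import urllib.request

def build_chat_request(api_key: str, user_text: str) -> urllib.request.Request:
    """Build an LLMTR chat completion request without sending it."""
    body = json.dumps({
        "model": "google/gemini-3.1-flash-lite",
        "messages": [{"role": "user", "content": user_text}],
    }).encode("utf-8")
    return urllib.request.Request(
        "https://llmtr.com/v1/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("sk_your_key", "Hello")
# urllib.request.urlopen(req) would actually send it; just show the target here.
print(req.full_url)  # https://llmtr.com/v1/chat/completions
```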
Related models
- Gemini Deep Research Preview (`google/deep-research-pro-preview-12-2025`)
- Gemini 2.5 Flash (`google/gemini-2.5-flash`)
- Gemini 2.5 Flash Image (`google/gemini-2.5-flash-image`)
- Gemini 2.5 Flash-Lite (`google/gemini-2.5-flash-lite`)
- Gemini 2.5 Flash-Lite Preview (`google/gemini-2.5-flash-lite-preview-09-2025`)
- Gemini 2.5 Flash Native Audio (Live API) (`google/gemini-2.5-flash-native-audio-preview-12-2025`)
- Gemini 2.5 Flash Preview TTS (`google/gemini-2.5-flash-preview-tts`)
- Gemini 2.5 Pro (`google/gemini-2.5-pro`)