Gemini 3.5 Flash - access through LLMTR

Built for production apps that need a 1M-token context window, a 65K-token output ceiling, tool use, PDFs, and media inputs without moving to a heavier Pro tier. It is the stable Flash path for teams migrating from Gemini 3 Flash Preview, and it keeps a strong balance of speed, quality, and cost.

Technical specifications

Canonical ID: google/gemini-3.5-flash
Provider: Google
Context window: 1,048,576 tokens
Operations: CHAT_COMPLETIONS, RESPONSES
Modalities: image, video, audio, text

Pricing

A 6% platform margin applies to credit top-ups; model usage prices are not separately marked up.

Operation         Metric         Unit                Price
CHAT_COMPLETIONS  CACHE_STORAGE  PER_1M_TOKEN_HOURS  $1.00
CHAT_COMPLETIONS  CACHE_WRITE    PER_1M_TOKENS       $0.15
CHAT_COMPLETIONS  INPUT_AUDIO    PER_1M_TOKENS       $1.50
CHAT_COMPLETIONS  INPUT_IMAGE    PER_1M_TOKENS       $1.50
CHAT_COMPLETIONS  INPUT_TEXT     PER_1M_TOKENS       $1.50
CHAT_COMPLETIONS  INPUT_VIDEO    PER_1M_TOKENS       $1.50
CHAT_COMPLETIONS  OUTPUT_TEXT    PER_1M_TOKENS       $9.00
CHAT_COMPLETIONS  TOOL_CALL      PER_1K_CALLS        $14.00
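
To make the pricing rows concrete, here is a minimal Python sketch of the arithmetic for a single workload. The helper name estimate_cost is hypothetical and not part of any LLMTR SDK. It assumes usage is billed linearly at the listed rates with no rounding or minimums, and it leaves out cache storage, cache writes, and the platform margin on top-ups.

```python
from decimal import Decimal

# Rates copied from the CHAT_COMPLETIONS rows of the pricing table (USD).
PRICE_PER_1M_INPUT_TOKENS = Decimal("1.50")   # text, image, audio, and video input share this rate
PRICE_PER_1M_OUTPUT_TOKENS = Decimal("9.00")  # OUTPUT_TEXT
PRICE_PER_1K_TOOL_CALLS = Decimal("14.00")    # TOOL_CALL


def estimate_cost(input_tokens: int, output_tokens: int, tool_calls: int = 0) -> Decimal:
    """Hypothetical helper: estimated USD cost, excluding cache charges.

    Assumes linear billing with no rounding, which the pricing table
    does not confirm.
    """
    input_cost = PRICE_PER_1M_INPUT_TOKENS * input_tokens / 1_000_000
    output_cost = PRICE_PER_1M_OUTPUT_TOKENS * output_tokens / 1_000_000
    tool_cost = PRICE_PER_1K_TOOL_CALLS * tool_calls / 1_000
    return input_cost + output_cost + tool_cost


if __name__ == "__main__":
    # 200,000 input tokens, 10,000 output tokens, 5 tool calls.
    print(estimate_cost(200_000, 10_000, tool_calls=5))
```

Under these assumptions, 200,000 input tokens ($0.30), 10,000 output tokens ($0.09), and 5 tool calls ($0.07) come to $0.46.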

Example usage

With existing OpenAI SDK flows, change only the base URL and model identifier.

curl https://llmtr.com/v1/chat/completions \
  -H "Authorization: Bearer sk_your_key" \
  -H "Content-Type: application/json" \
  -d '{"model":"google/gemini-3.5-flash","messages":[{"role":"user","content":"Hello"}]}'
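
The same request can be sketched in Python using only the standard library. The URL, headers, and JSON body mirror the curl command above; the function names build_chat_request and send_chat_request are hypothetical, and the response is returned as parsed JSON without assuming its exact shape.

```python
import json
import urllib.request

BASE_URL = "https://llmtr.com/v1"    # base URL taken from the curl example
MODEL_ID = "google/gemini-3.5-flash"  # canonical model ID from the specifications


def build_chat_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Hypothetical helper: build the same POST request as the curl example."""
    body = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat_request(api_key: str, prompt: str, timeout: float = 30.0) -> dict:
    """Hypothetical helper: send the request and return the decoded JSON body."""
    request = build_chat_request(api_key, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling send_chat_request("sk_your_key", "Hello") would perform the network call. Building the request separately keeps the payload easy to inspect or test without sending anything.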

Related models