Google Vertex

Hosted API
Models on live video: 2
Regions measured: 5
Real-time models: 1 of 2
Median output speed: 176 tok/s
Median E2E · US East: 3.9 s
Median blended $/1M: $5.17

Models on Google Vertex

1. Gemini 3 Flash (gemini-3-flash) · Google · 247 tok/s · 53 ms · 150 ms · $4.52 · <200 ms
2. Gemini 3.1 Pro (gemini-3-1-pro) · Reasoning · Google · 106 tok/s · 125 ms · 7.6 s · $5.83
Output speed by model

Gemini 3 Flash: 247 tok/s (highest)
Gemini 3.1 Pro: 106 tok/s

Reasoning models are indicated by a lightbulb icon
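
The headline medians can be cross-checked against the two per-model rows. The sketch below is illustrative only: it assumes that the second latency figure in each row (150 ms and 7.6 s) is the end-to-end time and that the dollar figure is the blended price per 1M tokens. The page does not label those columns explicitly, and the field names are hypothetical.

```python
from statistics import median

# Per-model figures copied from the rows above. The column meanings
# (e2e_s, blended_usd) are assumptions, not labels taken from the page.
rows = [
    {"model": "Gemini 3 Flash", "tok_s": 247, "e2e_s": 0.150, "blended_usd": 4.52},
    {"model": "Gemini 3.1 Pro", "tok_s": 106, "e2e_s": 7.6, "blended_usd": 5.83},
]

# With two models, the median is the midpoint of the two values.
median_speed = median([r["tok_s"] for r in rows])
median_e2e = median([r["e2e_s"] for r in rows])
median_price = median([r["blended_usd"] for r in rows])

print(f"median output speed: {median_speed} tok/s")
print(f"median E2E: {median_e2e:.3f} s")
print(f"median blended: ${median_price:.3f} per 1M")
```

Under those assumptions the midpoints come out at about 176.5 tok/s, 3.875 s, and $5.175, which agree with the headline 176 tok/s, 3.9 s, and $5.17 up to the page's rounding.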

Real-time coverage

US East (Virginia): 1 of 2 under 200 ms
US West (Oregon): 1 of 2 under 200 ms
EU West (Ireland): 1 of 2 under 200 ms
Asia Pacific (Tokyo): 1 of 2 under 200 ms
Asia Pacific (Mumbai): 0 of 2 under 200 ms

Nothing under the bar here yet.

Latency columns use the US East (Virginia) reference region. Real-time marks a model that closes the loop under 200 ms end to end on Google Vertex from at least one region.
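
The real-time rule stated above can be sketched in a few lines. This is a minimal illustration, not the site's actual implementation: the `Measurement` type and the function names are hypothetical, and only the US East figures are filled in, on the assumption that 150 ms and 7.6 s are the end-to-end times for that region.

```python
from dataclasses import dataclass

REAL_TIME_BAR_MS = 200  # threshold quoted in the note above


@dataclass
class Measurement:
    """One end-to-end latency reading (hypothetical structure)."""
    model: str
    region: str
    e2e_ms: float


def is_real_time(model, measurements, bar_ms=REAL_TIME_BAR_MS):
    """True if the model is under the bar from at least one region."""
    return any(m.model == model and m.e2e_ms < bar_ms for m in measurements)


def coverage(region, measurements, bar_ms=REAL_TIME_BAR_MS):
    """Return (models under the bar, models measured) for one region."""
    in_region = [m for m in measurements if m.region == region]
    under = sum(1 for m in in_region if m.e2e_ms < bar_ms)
    return under, len(in_region)


# US East values only; readings for the other four regions are not listed here.
measurements = [
    Measurement("Gemini 3 Flash", "US East (Virginia)", 150),
    Measurement("Gemini 3.1 Pro", "US East (Virginia)", 7600),
]

for name in ("Gemini 3 Flash", "Gemini 3.1 Pro"):
    print(name, "real-time:", is_real_time(name, measurements))

under, total = coverage("US East (Virginia)", measurements)
print(f"US East (Virginia): {under} of {total} under {REAL_TIME_BAR_MS} ms")
```

With these inputs only Gemini 3 Flash qualifies, and US East reports 1 of 2 under the bar, consistent with the coverage list.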