overshoot benchmarks

Google Vertex

Hosted API

Models on live video: 2
Regions measured: 5
Real-time models: 1 of 2
Median output speed: 176 tok/s
Median E2E · US East: 3.9 s
Median blended $/1M: $5.17
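
The headline medians appear to be derived from the two per-model rows in the table on this page. Below is a minimal sketch of that arithmetic, assuming the 150 ms / 7.6 s column is US East end-to-end latency and the dollar column is the blended price per 1M tokens. The dictionary keys are illustrative names, not fields from any Overshoot or Vertex API.

```python
from statistics import median

# Values copied from the "Models on Google Vertex" table.
# Key names are hypothetical, chosen only for this illustration.
models = [
    {"name": "Gemini 3 Flash", "tok_s": 247, "e2e_s": 0.150, "blended_usd": 4.52},
    {"name": "Gemini 3.1 Pro", "tok_s": 106, "e2e_s": 7.6, "blended_usd": 5.83},
]

# With only two models, the median is the midpoint of the two values.
median_speed = median(m["tok_s"] for m in models)
median_e2e = median(m["e2e_s"] for m in models)
median_price = median(m["blended_usd"] for m in models)

print(median_speed, median_e2e, median_price)
```

The midpoints come out at about 176.5 tok/s, 3.875 s, and $5.175, which agree with the displayed 176 tok/s, 3.9 s, and $5.17 up to rounding. How the page rounds is not stated.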

Models on Google Vertex

#	Model	Creator	Output speed	Latency	E2E	Blended $/1M	Real-time
1	Gemini 3 Flash (gemini-3-flash)	Google	247 tok/s	53 ms	150 ms	$4.52	<200 ms
2	Gemini 3.1 Pro (gemini-3-1-pro), Reasoning	Google	106 tok/s	125 ms	7.6 s	$5.83	–

[Chart: Output speed by model]

Real-time coverage

US East (Virginia): 1 of 2 under 200 ms
US West (Oregon): 1 of 2 under 200 ms
EU West (Ireland): 1 of 2 under 200 ms
Asia Pacific (Tokyo): 1 of 2 under 200 ms
Asia Pacific (Mumbai): 0 of 2 under 200 ms (nothing under the bar here yet)

Latency columns use the US East (Virginia) reference region. Real-time marks a model that closes the loop under 200 ms end to end on Google Vertex from at least one region.
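
The real-time rule can be written as a small predicate. This is a sketch under stated assumptions, not the site's implementation: the function name, the dictionary shape, and the strict less-than comparison are all hypothetical, and only the US East figures published in the table are used because per-region latencies are not listed.

```python
REAL_TIME_THRESHOLD_MS = 200  # the "under 200 ms end to end" bar from the note above


def is_real_time(e2e_ms_by_region, threshold_ms=REAL_TIME_THRESHOLD_MS):
    """Return True if at least one region's end-to-end latency is under the bar."""
    return any(ms < threshold_ms for ms in e2e_ms_by_region.values())


# Only the US East reference values are published in the table;
# other regions are omitted rather than guessed.
flash = {"US East (Virginia)": 150}   # Gemini 3 Flash, 150 ms
pro = {"US East (Virginia)": 7600}    # Gemini 3.1 Pro, 7.6 s

print(is_real_time(flash), is_real_time(pro))
```

A model with no measured regions is treated as not real-time here, since `any` over an empty collection is false.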