vLLM (self-host)
Self-host vLLM
- US East (Virginia): 281 ms
- EU West (Ireland): 291 ms
- Models on live video: 5
- Regions measured: 2
- Real-time models: 2 of 5
- Median output speed: 212 tok/s
- Median E2E · US East: 281 ms
- Median blended $/1M: $0.08
Models on vLLM (self-host)
| # | Model | Provider | Output speed | Latency | E2E | Blended $/1M | Real-time |
|---|---|---|---|---|---|---|---|
| 1 | Gemma 4 26B A4B (`gemma-4-26b-a4b`) | | 240 tok/s | 66 ms | 166 ms | $0.08 | <200 ms |
| 2 | Qwen3.6 35B A3B (`qwen3-6-35b-a3b`), Reasoning | Alibaba (Qwen) | 219 tok/s | 66 ms | 2.7 s | $0.08 | – |
| 3 | Holo3 35B A3B (`holo3-35b-a3b`) | H Company | 212 tok/s | 74 ms | 188 ms | $0.08 | <200 ms |
| 4 | Gemma 4 31B (`gemma-4-31b`) | | 137 tok/s | 106 ms | 281 ms | $0.17 | – |
| 5 | Qwen3.6 27B (`qwen3-6-27b`) | Alibaba (Qwen) | 136 tok/s | 105 ms | 281 ms | $0.17 | – |
Reasoning models carry a "Reasoning" label next to the model name.
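The headline medians can be cross-checked against the five table rows. The sketch below assumes they are plain medians over the listed models in US East; the page does not state how they are computed, so treat that as an assumption. The tuple layout is illustrative only.

```python
from statistics import median

# Rows copied from the table above:
# (model, output speed in tok/s, E2E in ms for US East, blended $ per 1M tokens).
# 2.7 s is written as 2700 ms so every E2E value shares one unit.
rows = [
    ("Gemma 4 26B A4B", 240, 166, 0.08),
    ("Qwen3.6 35B A3B", 219, 2700, 0.08),
    ("Holo3 35B A3B", 212, 188, 0.08),
    ("Gemma 4 31B", 137, 281, 0.17),
    ("Qwen3.6 27B", 136, 281, 0.17),
]

# Assumption: each headline figure is the median of one column.
median_speed = median(r[1] for r in rows)
median_e2e = median(r[2] for r in rows)
median_price = median(r[3] for r in rows)

print(median_speed, median_e2e, median_price)  # 212 281 0.08
```

Under that assumption the results agree with the summary figures: 212 tok/s, 281 ms, and $0.08.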
Real-time coverage
- US East (Virginia): 2 of 5 under 200 ms
- EU West (Ireland): 2 of 5 under 200 ms
Latency columns use the US East (Virginia) reference region. Real-time marks a model that closes the loop under 200 ms end to end on vLLM (self-host) from at least one region.
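The real-time rule can be sketched as a small check. This is an illustration, not the site's implementation: the function name, the region key, and the dictionary layout are hypothetical. Only US East values are used, since per-model EU West latencies are not listed on this page.

```python
REAL_TIME_BUDGET_MS = 200  # end-to-end threshold stated in the note above


def is_real_time(e2e_ms_by_region):
    """Return True if E2E latency is under the budget in at least one region."""
    return any(ms < REAL_TIME_BUDGET_MS for ms in e2e_ms_by_region.values())


# E2E latency per model, keyed by a hypothetical region label.
# Values are the US East figures from the table (2.7 s written as 2700 ms).
e2e = {
    "Gemma 4 26B A4B": {"us-east": 166},
    "Qwen3.6 35B A3B": {"us-east": 2700},
    "Holo3 35B A3B": {"us-east": 188},
    "Gemma 4 31B": {"us-east": 281},
    "Qwen3.6 27B": {"us-east": 281},
}

real_time = [name for name, regions in e2e.items() if is_real_time(regions)]
print(f"{len(real_time)} of {len(e2e)} real-time: {real_time}")
# 2 of 5 real-time: ['Gemma 4 26B A4B', 'Holo3 35B A3B']
```

Because the rule uses "at least one region", adding a second region to a model's dictionary can only turn a model real-time, never remove the mark.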