Provider Comparison
Same model, different infra.
Gemma 4 31B served on 4 providers. The weights are identical; the serving stack decides speed, latency, and real-time readiness.
2.0× faster on Overshoot than on self-hosted vLLM (270 vs. 137 tok/s)
| # | Provider | Serving | Speed | | | Price | Real-time |
|---|---|---|---|---|---|---|---|
| 1 | Overshoot | Overshoot | 270 tok/s | 52 ms | 141 ms | $0.27 | <200 ms |
| 2 | Together | Hosted API | 173 tok/s | 91 ms | 230 ms | $0.27 | – |
| 3 | Novita | Hosted API | 150 tok/s | 98 ms | 259 ms | $0.21 | – |
| 4 | vLLM (self-host) | Self-host vLLM | 137 tok/s | 106 ms | 281 ms | $0.17 | – |
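The 2.0× headline can be checked against the throughput column. The sketch below assumes "faster" means the ratio of tok/s figures, which the page does not state explicitly; the dictionary and the `speedup_over` helper are hypothetical names introduced only for this illustration.

```python
# Throughput figures (tokens per second) copied from the table above.
throughput_tok_s = {
    "Overshoot": 270,
    "Together": 173,
    "Novita": 150,
    "vLLM (self-host)": 137,
}


def speedup_over(baseline: str, candidate: str = "Overshoot") -> float:
    """Return how many times faster `candidate` is than `baseline`.

    Assumption: speedup is the plain ratio of throughput values.
    """
    return throughput_tok_s[candidate] / throughput_tok_s[baseline]


for provider in throughput_tok_s:
    if provider != "Overshoot":
        print(f"Overshoot vs {provider}: {speedup_over(provider):.2f}x")
# Overshoot vs Together: 1.56x
# Overshoot vs Novita: 1.80x
# Overshoot vs vLLM (self-host): 1.97x
```

Under that assumption, only the comparison against self-hosted vLLM rounds to 2.0×, which suggests the headline uses the slowest entry as its baseline.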