Together

Hosted API
Models on live video: 3
Regions measured: 3
Real-time models: 2 of 3
Median output speed: 261 tok/s
Median E2E · US East: 150 ms
Median blended $/1M: $0.12

Models on Together

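| # | Model | Model ID | Provider | Output speed | Latency | E2E latency | Blended $/1M | Real-time |
|---|-------|----------|----------|--------------|---------|-------------|--------------|-----------|
| 1 | Holo3 35B A3B | holo3-35b-a3b | H Company | 262 tok/s | 58 ms | 150 ms | $0.12 | <200 ms |
| 2 | Gemma 4 26B A4B | gemma-4-26b-a4b | Google | 261 tok/s | 58 ms | 150 ms | $0.12 | <200 ms |
| 3 | Gemma 4 31B | gemma-4-31b | Google | 173 tok/s | 91 ms | 230 ms | $0.27 | |
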
Output speed by model
[Bar chart, 3 bars: Holo3 35B A3B 262 tok/s (highest), Gemma 4 26B A4B 261 tok/s, Gemma 4 31B 173 tok/s.]

Real-time coverage

US East (Virginia): 2 of 3 under 200 ms
US West (Oregon): 2 of 3 under 200 ms
EU West (Ireland): 2 of 3 under 200 ms

Latency columns use the US East (Virginia) reference region. Real-time marks a model that closes the loop under 200 ms end to end on Together from at least one region.
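
The headline figures and the real-time flag can be reproduced from the per-model rows. The sketch below is illustrative only: the field names (`tok_s`, `e2e_ms`, `blended_usd`) and the `is_real_time` helper are hypothetical, not part of any Together API, and it assumes the 150 ms and 230 ms values are the end-to-end latencies. Only US East values are listed per model on this page, so that is the only region included.

```python
from statistics import median

# Threshold taken from the page's definition of "real-time".
REAL_TIME_THRESHOLD_MS = 200

# Values copied from the model table. Field names are hypothetical.
# Only the US East end-to-end latency is listed per model on the page.
models = [
    {"name": "Holo3 35B A3B", "tok_s": 262,
     "e2e_ms": {"US East (Virginia)": 150}, "blended_usd": 0.12},
    {"name": "Gemma 4 26B A4B", "tok_s": 261,
     "e2e_ms": {"US East (Virginia)": 150}, "blended_usd": 0.12},
    {"name": "Gemma 4 31B", "tok_s": 173,
     "e2e_ms": {"US East (Virginia)": 230}, "blended_usd": 0.27},
]


def is_real_time(e2e_ms_by_region):
    """True if end-to-end latency is under the threshold in at least one region."""
    return any(ms < REAL_TIME_THRESHOLD_MS for ms in e2e_ms_by_region.values())


summary = {
    "median_tok_s": median(m["tok_s"] for m in models),
    "median_e2e_us_east_ms": median(
        m["e2e_ms"]["US East (Virginia)"] for m in models
    ),
    "median_blended_usd": median(m["blended_usd"] for m in models),
    "real_time_models": sum(is_real_time(m["e2e_ms"]) for m in models),
}

print(summary)
```

Under these assumptions the medians and the real-time count agree with the summary figures at the top of the page (261 tok/s, 150 ms, $0.12, 2 of 3).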