
OpenAI API

Hosted API
Models on live video: 3
Regions measured: 4
Real-time models: 0 of 3
Median output speed: 245 tok/s
Median E2E (US East): 3.0 s
Median blended $/1M: $4.78

Models on OpenAI API

#  Model         Slug          Type       Provider  Output speed  Latency  E2E    Blended $/1M
1  GPT-5.4 nano  gpt-5-4-nano  Reasoning  OpenAI    354 tok/s     48 ms    1.7 s  $3.20
2  GPT-5.4 mini  gpt-5-4-mini  Reasoning  OpenAI    245 tok/s     60 ms    3.0 s  $4.78
3  GPT-5.4       gpt-5-4       Reasoning  OpenAI    91 tok/s      138 ms   9.3 s  $6.68
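
The summary medians at the top of the page can be cross-checked against the three model rows. The following is a minimal Python sketch, assuming the seconds value in each row is the US East E2E time and the dollar value is the blended $/1M price; the names `ROWS` and `summarize` are illustrative and not part of any API.

```python
from statistics import median

# Rows transcribed from the model listing:
# (model, output speed in tok/s, E2E in seconds, blended $ per 1M).
# Treating the seconds column as US East E2E and the dollar column as
# blended $/1M is an assumption based on the summary labels.
ROWS = [
    ("GPT-5.4 nano", 354, 1.7, 3.20),
    ("GPT-5.4 mini", 245, 3.0, 4.78),
    ("GPT-5.4", 91, 9.3, 6.68),
]


def summarize(rows):
    """Return the median of each numeric column across all models."""
    return {
        "median_tok_s": median([r[1] for r in rows]),
        "median_e2e_s": median([r[2] for r in rows]),
        "median_blended_usd": median([r[3] for r in rows]),
    }


summary = summarize(ROWS)
print(summary)
```

With three rows, each median is simply the middle value, which is why all three summary figures coincide with the GPT-5.4 mini row.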
Output speed by model
Bar chart of 3 models. Highest: GPT-5.4 nano at 354 tok/s, followed by GPT-5.4 mini at 245 tok/s and GPT-5.4 at 91 tok/s.

Reasoning models are indicated by a lightbulb icon

Real-time coverage

No model clears the 200 ms bar on OpenAI API yet, in any region we measure.

Latency columns use the US East (Virginia) reference region. A model is marked real-time if it closes the loop in under 200 ms end to end on OpenAI API from at least one measured region.
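
The real-time rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the site's implementation: `is_real_time` and `E2E_MS` are hypothetical names, and only the US East E2E figures listed on this page are included because per-region values for the other three regions are not shown here.

```python
REAL_TIME_THRESHOLD_MS = 200  # the 200 ms end-to-end bar described above


def is_real_time(e2e_ms_by_region, threshold_ms=REAL_TIME_THRESHOLD_MS):
    """True if the model is strictly under the threshold in at least one region."""
    return any(ms < threshold_ms for ms in e2e_ms_by_region.values())


# E2E latency in milliseconds, keyed by model and then region.
# Only US East values appear on this page; other regions are omitted
# rather than guessed.
E2E_MS = {
    "GPT-5.4 nano": {"US East": 1700},
    "GPT-5.4 mini": {"US East": 3000},
    "GPT-5.4": {"US East": 9300},
}

real_time = [model for model, regions in E2E_MS.items() if is_real_time(regions)]
print(f"{len(real_time)} of {len(E2E_MS)}")  # prints "0 of 3"
```

Using `any` over regions mirrors the "from at least one region" wording: a single fast region is enough to qualify, so adding more regions can only raise the count.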