overshoot benchmarks

Compare models

Up to four models side by side, measured the way they run on live video.

Side-by-side comparison of the selected models across intelligence, capability, speed, latency, cost, context, and licensing. The best value in each numeric row is marked.
Metric (2 of 4 models selected)	Claude Opus 4.6 (Anthropic)	GPT-5.4 (OpenAI)
Intelligence Index	86 (best)	85
OCR & Text	95	95
Document & Chart	87 (best)	85
Scene & Spatial	84	86 (best)
Video QA	86 (best)	83
Grounding & Detection	76	76
Structured Extraction	84	85 (best)
Output Speed	55 tok/s (-36 tok/s)	91 tok/s (best)
TTFT	207 ms (+70 ms)	138 ms (best)
End-to-End	12 s (+3.0 s)	9.3 s (best)
Real-time (<200 ms)	No	No
Blended $/1M	$6.93 (+$0.25)	$6.68 (best)
Cost / Task	$5.11	$4.93 (best)
Context	400k	400k
Params	–	–
Open?	Closed	Closed
License	Proprietary	Proprietary
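
The marking rule in the table above can be sketched in a few lines of Python. This is a minimal illustration, not the site's implementation. The function names `best_in_row` and `gap_to_best` and the `LOWER_IS_BETTER` set are hypothetical. The direction of each metric is an assumption inferred from which value the table marks (higher wins for scores and output speed, lower wins for latency and cost). Tied rows are left unmarked, as in the table.

```python
from typing import Optional

# Assumption: metrics where a smaller number is better, inferred from the table.
LOWER_IS_BETTER = {"TTFT", "End-to-End", "Blended $/1M", "Cost / Task"}


def best_in_row(metric: str, values: dict[str, float]) -> Optional[str]:
    """Return the model with the best value, or None when the best value is tied."""
    pick = min if metric in LOWER_IS_BETTER else max
    best = pick(values.values())
    winners = [model for model, value in values.items() if value == best]
    return winners[0] if len(winners) == 1 else None


def gap_to_best(metric: str, values: dict[str, float]) -> dict[str, float]:
    """Return each model's signed difference from the best value in the row."""
    pick = min if metric in LOWER_IS_BETTER else max
    best = pick(values.values())
    return {model: round(value - best, 2) for model, value in values.items()}


speed = {"Claude Opus 4.6": 55, "GPT-5.4": 91}
print(best_in_row("Output Speed", speed))
print(gap_to_best("Output Speed", speed))
```

With the Output Speed row as input, the sketch picks GPT-5.4 and reports a gap of -36 for Claude Opus 4.6, which matches the figure shown in that row.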


Capability profile (chart): Claude Opus 4.6 and GPT-5.4