The news, 365 days behind — on purpose Delayed live · replaying 2025

One Year Ago.AI

Remember how fast this is.

27 Feb 2025 · replayed one year on
model launch · OpenAI

OpenAI previews GPT-4.5 Orion, its largest AI model yet

The model shows diminishing returns from scaled pre-training, with pricing at $75/M input tokens and OpenAI’s own benchmark comparisons raising questions about the future of the scaling paradigm.

OpenAI announced on Thursday that it is launching GPT-4.5, codenamed Orion, its largest model to date, trained using more computing power and data than any of the company’s previous releases. Available as a research preview to ChatGPT Pro subscribers and API developers, the model arrives amid intense debate about whether traditional pre-training scaling still yields meaningful gains.

OpenAI says GPT-4.5 may not take the AI benchmark crown on its own. Internal testing shows GPT-4.5 trails reasoning models like DeepSeek R1, Anthropic’s Claude 3.7 Sonnet, and OpenAI’s own o3-mini on math and coding benchmarks. It performs better on factual accuracy and hallucination reduction, and OpenAI highlights its ‘warmer, more natural’ writing style and emotional intelligence.

Pricing has drawn attention: $75 per million input tokens and $150 per million output tokens — 30 times GPT-4o’s input-token price and 15 times its output-token price. OpenAI acknowledges it is ‘evaluating whether to continue serving GPT-4.5 in its API in the long term.’ The company’s white paper originally stated GPT-4.5 ‘is not a frontier AI model,’ though that line was removed hours after release.
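To make the price gap concrete, the arithmetic can be sketched in a few lines of Python. The GPT-4.5 prices and the 30x and 15x multiples come from the figures above; the GPT-4o prices are back-calculated from those multiples rather than quoted, and the request size (10,000 input tokens, 1,000 output tokens) is a hypothetical workload chosen only for illustration.

```python
# Prices in US dollars per million tokens, as quoted in the article.
GPT45_INPUT_PRICE = 75.0
GPT45_OUTPUT_PRICE = 150.0

# Multiples over GPT-4o stated in the article.
INPUT_MULTIPLE = 30
OUTPUT_MULTIPLE = 15


def implied_baseline(price: float, multiple: float) -> float:
    """Back out the GPT-4o price implied by a GPT-4.5 price and its multiple."""
    return price / multiple


def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> float:
    """Cost in dollars of one request, given per-million-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# GPT-4o prices implied by the article's figures (derived, not quoted).
gpt4o_input_price = implied_baseline(GPT45_INPUT_PRICE, INPUT_MULTIPLE)
gpt4o_output_price = implied_baseline(GPT45_OUTPUT_PRICE, OUTPUT_MULTIPLE)

# Hypothetical request size, assumed for illustration only.
INPUT_TOKENS = 10_000
OUTPUT_TOKENS = 1_000

cost_gpt45 = request_cost(INPUT_TOKENS, OUTPUT_TOKENS,
                          GPT45_INPUT_PRICE, GPT45_OUTPUT_PRICE)
cost_gpt4o = request_cost(INPUT_TOKENS, OUTPUT_TOKENS,
                          gpt4o_input_price, gpt4o_output_price)

print(f"Implied GPT-4o prices: ${gpt4o_input_price:.2f} in, ${gpt4o_output_price:.2f} out per million tokens")
print(f"GPT-4.5 request cost:  ${cost_gpt45:.3f}")
print(f"GPT-4o request cost:   ${cost_gpt4o:.3f}")
```

Under these assumptions the implied GPT-4o prices work out to $2.50 and $10.00 per million tokens, and the same request costs about $0.90 on GPT-4.5 versus about $0.035 on GPT-4o. The overall multiple for any real workload falls between 15x and 30x, depending on its mix of input and output tokens.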

The model’s limited performance leap intensifies a growing industry conversation: scaling up data and compute may no longer deliver the dramatic improvements of earlier generations. OpenAI plans to merge its GPT and reasoning model series into GPT-5 later this year.

Ilya Sutskever

Previously said that pre-training as we know it will end, a view echoed in industry concerns that scaling laws are plateauing.

One year later — open only if you can handle spoilers

GPT-4.5 remained a niche, expensive model with limited adoption. The scaling plateau discussion it crystallized accelerated the industry's pivot to reasoning and test-time compute scaling, which dominated 2025's second half.
