GPT-5 lands but not everyone’s happy

After what was perhaps the most anticipated launch of the post-ChatGPT era, we finally get to see GPT-5, OpenAI’s major new “platform-wide” upgrade. Now when most of ChatGPT’s nearly 1 billion users hit the chat screen, they see a single option, “5”, and a router quietly decides whether to use the faster base model or switch into a longer “thinking” mode, depending on the complexity of the request. This change brings model routing to the mainstream and removes the need for users to pick “smarter” or “smaller” models themselves, although it hasn’t gone down well with everyone. In a long and varied launch stream, most notable for some iffy benchmark charts, CEO Sam Altman pitched the upgrade as “the best model in the world at coding and writing” and said it now feels like talking to a “PhD-level expert”.
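To make the routing idea concrete, here is a toy sketch of what a router of this kind might look like. It is purely illustrative: OpenAI has not published how its router works, and the function name `route`, the keyword hints, the 200-word threshold and the `force_thinking` flag are all assumptions invented for this example.

```python
# Toy illustration only: NOT OpenAI's router. All names and thresholds here
# are hypothetical, chosen to show the general shape of the idea.
REASONING_HINTS = ("prove", "debug", "step by step", "plan")


def route(prompt: str, force_thinking: bool = False) -> str:
    """Return "thinking" for long or reasoning-heavy prompts, else "fast"."""
    text = prompt.lower()
    is_long = len(text.split()) > 200  # crude proxy for complexity
    has_hint = any(hint in text for hint in REASONING_HINTS)  # naive substring match
    if force_thinking or is_long or has_hint:
        return "thinking"
    return "fast"


print(route("What is the capital of France?"))
print(route("Debug this race condition step by step"))
```

A real router would presumably rely on learned signals rather than keywords, but the sketch shows why a manual override (here the hypothetical `force_thinking` flag) is a natural thing for users to ask for.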

On the numbers, GPT-5 is a clear step up. There’s an extended 256k context window, and OpenAI says responses are about 45% less likely to contain factual errors than GPT-4o, and when the model is thinking, about 80% less likely than o3. It posts 94.6% on AIME 2025 without tools, 74.9% on SWE-bench Verified, 88% on Aider Polyglot, 84.2% on MMMU, and GPT-5 pro reaches 88.4% on GPQA without tools. Safety work includes “safe completions” that answer sensitive questions at a higher level rather than refusing outright. Microsoft is also rolling GPT-5 through Copilot, GitHub Copilot and Azure AI Foundry, which will help it reach enterprise workflows quickly.

OpenAI’s launch leaned hard on vibe-coding mini games, fun to watch but not a new AI skill. The difference seems to be polish. GPT-5 keeps track of assets, styles and game logic with fewer slips, and it follows light art direction without losing the brief. Most testers came away impressed by its attention to detail and a useful streak of creativity, even if the process felt familiar.

But the current “vibes” on X suggest this launch has not gone smoothly. In fact, social media reaction from many AI influencers has been very negative. Many developers do say GPT-5 is stronger in coding, tool use and long multi-step tasks, and that it feels more consistent than juggling 4o, 4.1 and the o-series. But most hoped for a bigger jump. Reuters reported that early reviewers were impressed but judged the leap from GPT-4 to GPT-5 to be smaller than in past cycles. That frames GPT-5 as a strong upgrade that keeps OpenAI near the front of a fast pack that includes Gemini, Claude and Grok.

More frustrating has been the launch process itself. Many users had grown attached to legacy models and their “feel”. OpenAI removed all of them at a stroke (from the web interface, if not the API) in the switch to 5, and for some it felt like walking into a favourite bar and finding the whole team replaced in one night, even if the replacements are more qualified.

In a live Reddit AMA and a subsequent X post, Sam Altman told users he understood the frustration and realised OpenAI had underestimated the affinity for older models. He said the company is looking at options to keep 4o for certain users, or at ways to better customise outputs. He also acknowledged an issue where “the auto switcher was out of commission” for part of the day, which likely fed early “it feels worse” reports. He added that OpenAI will make it easier to manually trigger thinking and will “double rate limits for Plus” as the rollout settles. Altman also owned the launch chart errors, calling them a “mega chart screwup”.

Pricing on the API side is highly competitive, at $1.25 per million input tokens and $10 per million output tokens, with mini and nano variants scaling down the cost. There are also new controls, including a verbosity setting, a minimal reasoning mode and other output controls. For most development teams, that is enough choice without bringing back a maze of model names.
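For a rough sense of what those prices mean per request, the arithmetic can be sketched as follows. The two prices are the ones quoted above; the function name `estimate_cost` and the example token counts are assumptions made up for illustration, and real bills may differ (for example through caching discounts or the cheaper mini and nano tiers).

```python
# Prices quoted in the article, in USD per one million tokens.
INPUT_PRICE_PER_M = 1.25
OUTPUT_PRICE_PER_M = 10.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the quoted GPT-5 prices."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical request: a 20,000-token prompt with a 2,000-token reply.
cost = estimate_cost(20_000, 2_000)
print(f"${cost:.3f}")  # $0.045
```

Note that output tokens cost eight times as much as input tokens at these prices, so trimming long replies (for instance via the verbosity setting) matters more than trimming prompts.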

Takeaways: GPT-5 is a platform release. Routing, controls, safer answers and solid benchmark gains matter more than a single headline score at this stage of AI’s evolution. Many users like the upgrade in coding and agent-style tasks, and some miss the old models. OpenAI will need to listen, fix the router’s rough edges and be clearer about which model is active. Competitively, this puts OpenAI back in stride, but not miles ahead. The pricing stack and Microsoft integrations should drive real adoption and utility across agentic AI. Expect the next few months to be about reliability, controls and agent workflows, not grand leaps toward AGI.
