In Defense of ChatGPT-5: The Great Unlock

Although it seems in vogue over the last week to hate on ChatGPT-5, I actually think it's a great update. Hear me out. 

After almost three years of ChatGPT being in the wild, fewer than one in five professionals rely on it in their workflows. (Based on research for my next book.)

OpenAI's latest update addresses two key issues for users, especially those just coming around to generative AI:

1. Lack of reliability (hallucinations)

2. Poor return on attention (wasted time)

Hallucinations:

From OpenAI's system card: "GPT-5's responses are ~45% less likely to contain a factual error than GPT-4o, and when thinking, GPT-5's responses are ~80% less likely to contain a factual error than OpenAI o3." In internal evals of real-world user prompts, GPT-5 in "thinking" mode was found to respond with incorrect information only 4.8% of the time, a steep drop from the 20–22% hallucination rates measured for GPT-4o and the o3 model.


With research being a key use case for novice to intermediate users, THIS IS A BIG DEAL. The more we can trust ChatGPT's outputs, the more productive we can be using it, especially compared to swimming through Blue Link Hell on Google.

The most striking example of accuracy's value is the medical assistant use case. As I've been recovering from surgery over the last few weeks, ChatGPT's medical assistant has been a real source of agency for me. I get instant answers when I can't reach my doctor, and I don't have to read on Reddit that I'm going to die.

"On a medical Q&A test (HealthBench Hard), GPT-5 (with reasoning) hallucinated just 1.6% of the time, whereas GPT-4o and o3 had false info rates of 12.9% and 15.8% respectively."

Whoa, most of my doctors (ahem, physician assistants) can't boast that accuracy! Especially given how overloaded they are with patients.

Every great innovation hits the tipping point with a few killer app use cases. With the internet, it was email and search. For this new era, medical advice on tap may be a springboard into mainstream adoption.

Most importantly, ChatGPT-5 is more likely to tell you when it's uncertain or doesn't have an answer. That's a breakthrough when it comes to trust.

Return on attention:

One of the limiting factors on mainstream adoption has been our sense of personal productivity as we use ChatGPT. We walk away empty-handed after an hour, or we lose precious time waiting on the model to think its way through a task.

Some of us have been stuck in a rut, using the base model 4o for everything. It's not necessarily our fault that we aren't using the right model for the task.

As Wharton professor Ethan Mollick points out, “OpenAI previously made the default ChatGPT use fast, dumb models, hiding the good stuff from most users… people have never seen what AI can do because they’re stuck on GPT-4o and don’t know which of the confusingly-named models are better.”

This is especially true for anyone who's tried to vibe code or produce slide-ready charts using 4o. Those users walked away from prompting sessions with less useful results.

Others got stuck using full reasoning models for simple requests, so every prompt took 5 to 10 minutes to answer. It's easy on o3 to let an hour slip by without noticing. But these users believed that reasoning models were always better, so that just became their default.

ChatGPT-5's model router fixes this on both sides of the equation. Under the hood, ChatGPT-5 is not a single monolithic model, but rather a "unified system." As Ethan Mollick points out: "GPT-5 is not one model as much as it is a switch that selects among multiple GPT-5 models of various sizes and abilities."

From OpenAI's documentation: "This upgrade is aimed at giving users the best of both worlds – quick answers for simple questions and deeper, more “thinking” answers for complex tasks – all without the user needing to toggle or guess which model to use."

Here's how it works: GPT-5 analyzes the query’s complexity and context. For an easy or well-defined question it will likely respond with the faster, lighter model, whereas a more complex or ambiguous problem will trigger the system to engage the heavier “reasoning” model that can “think” longer and more rigorously. This decision also takes into account factors like conversation history, tool use (e.g. if code execution or web browsing is needed), and even explicit user instructions – for example, including a phrase like “think hard about this” will prompt GPT-5 to use a more intensive reasoning mode. 
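To make the routing idea concrete, here's a minimal sketch of how a complexity-based router could work. This is purely illustrative: the heuristics, thresholds, and model names below are my own assumptions, not OpenAI's actual implementation, which hasn't been published.

```python
# A minimal, illustrative sketch of complexity-based routing between two
# model tiers. The heuristics, thresholds, and model names are assumptions
# for illustration; OpenAI has not published the router's internals.

FAST_MODEL = "gpt-5-main"            # hypothetical: lighter, faster model
REASONING_MODEL = "gpt-5-thinking"   # hypothetical: heavier reasoning model


def route(prompt: str, needs_tools: bool = False, history_turns: int = 0) -> str:
    """Pick a model tier from rough signals of query complexity."""
    text = prompt.lower()

    # Explicit user instruction to reason more deeply ("think hard about this").
    if "think hard" in text or "step by step" in text:
        return REASONING_MODEL

    # Tool use (code execution, web browsing) suggests a harder task.
    if needs_tools:
        return REASONING_MODEL

    # Long prompts or long conversations tend to be more complex.
    if len(prompt.split()) > 200 or history_turns > 10:
        return REASONING_MODEL

    # Default: answer quickly with the lighter model.
    return FAST_MODEL


if __name__ == "__main__":
    print(route("What's the capital of France?"))                 # -> gpt-5-main
    print(route("Think hard about this architecture tradeoff."))  # -> gpt-5-thinking
```

The real router presumably relies on learned signals rather than hand-written rules, but the shape of the decision is the same: cheap and fast by default, slow and careful when the task demands it.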

This will unlock both speed and quality of results for users across the board, which can only lead to more integration of ChatGPT into workflows and daily life.

That's why I'm bullish on this release, despite what the haters are saying about OpenAI trying to save money, broken workflows, lower-quality answers for some use cases, etc.

No model release is perfect, but OpenAI can fix the flaws over time. What matters is the strategy behind it, which in my opinion is the right one for the times.

We need to go faster into the Age of AI and ChatGPT-5 can accelerate everything. 

To read more of my articles on AI, visit my Articles Page on my website.

If you disagree, I welcome your comments and look forward to the discussion.

Skip Balch

Thanks again for an explanation that is understandable and relevant. Long time follower, first time subscriber. Get well my friend!

Kate A Larsen

Thanks for the useful update. Why do we "need" to go faster into the Age of AI? (It's not that I don't think it can be super useful; I'm just curious what particular urgent needs you think it can meet). Thanks

Godard Abel

Compelling insight on the less obvious, below-the-surface iceberg value of ChatGPT-5, with much less hallucination and more research precision 🤔 🤖

George Burch

I don't so much disagree, but generative AI is just one well-understood process that relies on a search seeking the lowest points in a high-dimensional space, and that space does not include salient data about human text production. For example, humans organize text into domains of knowledge they are expert in. Core knowledge in textbooks has tables of contents and uses typeface as an indicator of the meaning of the sections. ChatGPT and every other model wipe out those domain boundaries when they tokenize the text. Some sentences in text are true and others are deliberate misinformation; those differences are lost in tokenizing. Post-training, LLM builders are spending billions on sweatshop annotation vendors like Scale AI to monopolize HU after the fact. Just semantically annotating the meaning of one sentence among the 600 billion would take a hundred years, based on Scale's cost of 80,000 hours to do 10 million annotations a week. That sounds like a lot and costs a lot ($10-$100 an hour), but a single sentence requires at least 1,000 annotations to get through 600 billion. That's a doomed fail if the goal is fixing training that already cost billions. But if there is no other way, perhaps there is no other way. (cont.)

Brian Silverman

Tim Sanders, great post. The other breakthrough is the decreased cost and increased efficiency, which is a huge change from previous models. I wrote about it this morning: https://www.linkedin.com/posts/absilverman_gpt5-gpt5-artificialintelligence-activity-7361388155507232768-mRcR?utm_source=share&utm_medium=member_desktop&rcm=ACoAAAFG2NQB5YR5pjapz9qCmuCcwJRHaOy3M7w
