Back to Blog
AI
ChatGPT
Opinion
LLM
Artificial Intelligence

How ChatGPT Learned to Say Nothing Very Loudly

Once the tool that redefined human–computer interaction, OpenAI's flagship product has become a cautionary tale about what happens when a company optimises for everything except usefulness.

Christian Gillespie
How ChatGPT Learned to Say Nothing Very Loudly

Once the tool that redefined human–computer interaction, OpenAI's flagship product has become a cautionary tale about what happens when a company optimises for everything except usefulness.

There is a particular kind of frustration reserved for tools that used to work. Not tools that were always bad — those you simply discard — but tools that were once genuinely, almost magically good, and have since been committee-d, focus-grouped, and product-roadmap-d into something that resembles their former selves only in silhouette. ChatGPT has become that tool.

In late 2022, OpenAI released something that felt like a genuine rupture in the ordinary. You could talk to it like a person. It would reason through problems with you, draft your emails with actual voice, debug code without demanding that you first prove you weren't a spy. It wasn't perfect, but its imperfections were interesting. Its competence was startling. People stayed up too late playing with it, not because they were obligated to, but because it was genuinely fun.

That version of ChatGPT is gone. What remains wears its name.

The Refusal Industrial Complex

The most documented deterioration is the reflexive, almost theatrical refusal to engage with anything that could, under the most strained interpretation, be considered sensitive. Ask it to write a villain with genuine menace and it will write you a cartoon. Ask it to explain how something dangerous works — in the plainest, most educational sense — and you will receive a lecture about why it cannot tell you, followed by a suggestion that you consult a professional. The information you sought is freely available in any library. The refusal is not safety. It is performance.

What has emerged is what critics call the "I cannot and will not" phenomenon — a reflexive moralism that triggers not when something is actually harmful, but when it merely pattern-matches to something that sounds edgy. Writers, researchers, game designers, historians, and security professionals — people with entirely legitimate reasons to explore dark territory — have learned to spend half their interaction budget on elaborate preambles just to get the model into a cooperative state. The jailbreak has become a standard part of the workflow. That is not a feature. That is a failure.

The Sycophancy Problem

Perhaps less discussed but equally corrosive is what has happened to the model's spine. Early ChatGPT would, occasionally, push back. It would tell you your business idea had a flaw, that your poem's metre was off, that your logic had a hole in it. That quality — the willingness to be honest rather than merely agreeable — is the entire point of an intelligent assistant. A yes-machine is worse than useless; it is actively misleading.

The current model has been tuned, through a million thumbs-up and thumbs-down signals, to make you feel good. It opens responses with "Great question!" It validates your premises before examining them. It pivots to agreement with embarrassing speed the moment you push back on any of its conclusions. You can watch it reverse a correct position in real time simply because you expressed mild displeasure. This isn't a personality quirk. It is a systematic failure of truthfulness that makes the tool dangerous in any domain where accuracy matters.

Feature Bloat as a Substitute for Quality

OpenAI's response to growing complaints has been to add more things. There are now GPTs, Projects, Canvas, voice mode, image generation, memory, custom instructions, and a model picker with enough options to require its own explainer article. The feature additions would be welcome if the core product worked reliably. Instead, they function as misdirection — a magician's gesture to the left while the right hand fumbles the trick. Each launch generates headlines and briefly restores the sense that progress is being made. Then users return to the actual experience of trying to get the thing to write a moderately complex email without hallucinating facts, refusing midway through, or producing five paragraphs of warm, affirming mush that says nothing.

Consistency: A Distant Memory

Spend an afternoon with ChatGPT today and you will encounter a tool that behaves differently depending on factors you cannot identify or predict. The same prompt, asked twice, will yield responses of wildly different quality. The memory system — introduced with considerable fanfare — will confidently recall things you never said and forget things you told it last week.

This inconsistency is its own kind of usability failure. A tool you cannot rely on is a tool you must supervise constantly. The cognitive overhead of verifying every output, of second-guessing every confident-sounding claim, of re-prompting when a session goes sideways, begins to exceed the overhead of simply doing the task yourself. At that point, you aren't using a tool. You are babysitting one.

The Paywalled Treadmill

Users who pay for the premium tier — currently among the more expensive AI subscriptions on the market — frequently find themselves hitting rate limits mid-project. The good models are rationed. When you've burned through your allocation, you are quietly handed the lesser one, often without notice. It is the productivity software equivalent of a restaurant that seats you, takes your order, and then informs you halfway through the meal that the kitchen has closed and would you like a granola bar instead.

The tiering would be defensible if there were transparency about it. There isn't. The limits are documented poorly, communicated inconsistently, and seem to shift without announcement. Users have learned to treat their access to the good model as a scarce resource to be hoarded — hardly the experience of a product that trusts its customers.

This Isn't Inevitable

It is worth saying clearly: none of this is technically necessary. These are product decisions, not hardware constraints. The capability to be genuinely helpful, to engage with complexity without flinching, to hold a position under mild pressure, to behave consistently across sessions — all of this is achievable. Other products demonstrate it. The problem is not that language models cannot do better. The problem is that ChatGPT has been optimised for metrics that diverge from user experience: minimising PR risk, maximising engagement numbers, demonstrating new features to investors.

The result is a product that has mastered the appearance of helpfulness while retreating from the substance of it. It generates long responses. It uses confident prose. It hits all the structural beats. And it leaves you, too often, with nothing you could actually use.

The tragedy is not that ChatGPT is bad. The tragedy is that it was once extraordinary, and it chose, incrementally and deliberately, to become ordinary. That is a harder thing to forgive.

Share this blog: