Relative Superintelligence
Mark Pesce · University of Sydney · June 2026
Abstract
"Superintelligence" conflates three distinct phenomena: projected (the ELIZA illusion), relative (genuinely superhuman within a domain), and actual (superhuman across all domains). Relative superintelligence is already here. Steve Yegge's "discernment horizon" names the operational consequence: past some level of capability, no human can verify the output. Superhuman means unverifiable. The intelligence curve continues to steepen, but behind locked doors. For most users and most organisations, it has already flattened. The post-Watershed economy will operate at the frontier of verifiability, not capability: the triumph of the good enough.
Three Kinds of Superintelligence
The word "superintelligence" has been doing too much work. It gets applied to everything from chatbots that impress credulous users to hypothetical world-ending optimisers. The sloppiness matters, because the risks attached to each meaning are completely different.
There are three distinct phenomena hiding behind the word, and separating them clarifies what we face.
Projected Superintelligence
The ELIZA problem: In 1966, Joseph Weizenbaum built a simple pattern-matching program that mimicked a Rogerian therapist. Users, knowing it was a program, nonetheless attributed understanding, empathy, and intelligence to it. They filled in the gaps. The machine provided the frame; the human provided the picture.
Every generation of language model has produced its own wave of ELIZA effects. GPT-3 generated breathless claims of sentience. ChatGPT convinced millions of casual users they were talking to something that understood them. In each case, the "superintelligence" was a projection: the user's own cognitive machinery doing the heavy lifting, interpolating coherence from statistical patterns.
Projected superintelligence is not dangerous in the way the other two forms are. It is dangerous in the way any illusion is dangerous: it leads to misplaced trust. The user who believes the model understands them will trust its output in domains where it hallucinates freely. This is the left tail of the safety bell curve from "A Measure of Safety," the "not good enough" zone, made worse by the user's inability to see that the tokens are not good enough.
The cure for projected superintelligence is literacy. Understanding what the model actually does, rather than what it feels like it does, collapses the projection. The model has not changed. The user's expectations have.
Relative Superintelligence
The real thing, within a domain.
A model that can discover thousands of zero-day vulnerabilities that survived decades of human review is, with respect to vulnerability discovery, superintelligent relative to every human security researcher alive. A model that scores 97.6% on USAMO 2026 is, with respect to mathematical olympiad problems, superintelligent relative to nearly all human mathematicians.
This is Mythos. This is what "A Measure of Safety" calls "too good" tokens. And Steve Yegge, arriving at the same conclusion from the practitioner's side, has given it a name that cuts to the operational heart of the problem: the discernment horizon.
Yegge distinguishes two ceilings that every user hits as models improve.
The first is the demand horizon: the point at which the hardest problem you bring is not hard enough to tell two models apart. If all your problems are easy, Opus and Fable look the same. The demand horizon is benign. It just means your work is not stretching the model. Bring a harder problem and the horizon widens.
The second is darker. The discernment horizon is set not by the hardest problem you can pose, but by the hardest answer you can judge. Past this line, you cannot tell whether the model is right, because checking the work is itself beyond you.
Everyone has a discernment horizon. Even the CEO of Anthropic.
Past some level of capability, no human alive can verify the model's output. As Yegge puts it: "Superhuman means unverifiable."
This is the operational definition of "too good" tokens, expressed from the user's perspective rather than the economist's. "A Measure of Safety" frames the problem in terms of harness containment: the harness cannot evaluate what it cannot comprehend. Yegge frames it in terms of human supervision: the human cannot verify what exceeds their competence. They are the same boundary, seen from two sides.
That boundary has consequences. You cannot hand out an intelligence engine that nobody can supervise. You don’t want the more powerful model. You want the safer one, even if it is less capable. The Goldilocks effect from "How Much Does a Token Cost?" arrives here by a completely different route: the most valuable model is not the most capable one, but the most capable one whose output you can still verify.
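The selection rule in that last sentence can be written down as a minimal sketch. It assumes, purely for illustration, that model capability and a user's discernment horizon can be placed on one shared numeric scale; the `Model` type, the scores, and the function name are hypothetical and come from neither Yegge nor the earlier papers.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Model:
    name: str
    capability: float  # hypothetical scalar score; higher means more capable


def most_capable_verifiable(
    models: Sequence[Model], discernment_horizon: float
) -> Optional[Model]:
    """Return the most capable model whose output the user can still grade."""
    # Assumption: output is verifiable when capability does not exceed the horizon.
    verifiable = [m for m in models if m.capability <= discernment_horizon]
    # default=None covers the case where every model is already past the horizon.
    return max(verifiable, key=lambda m: m.capability, default=None)


models = [Model("small", 3.0), Model("medium", 6.0), Model("frontier", 9.0)]
choice = most_capable_verifiable(models, discernment_horizon=7.0)
print(choice.name if choice else "no verifiable model")  # prints "medium"
```

Under these assumptions the frontier model is never chosen by a user whose horizon sits at 7.0, however cheap it becomes. Raising the horizon is the only thing that changes the answer.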
Actual Superintelligence
Not relative to some particular human, or some particular domain, or some particular harness. Superintelligent full stop.
We do not have this yet. We may not be far from it. We do not have the tools to know. Nor is it clear we will ever have such tools.
The distinction matters because actual superintelligence breaks the one escape hatch that relative superintelligence leaves open. With relative superintelligence, you can at least imagine a solution: use a more capable system to check the less capable one. A stronger model audits the weaker model's work. A specialised verification harness cross-checks outputs against known constraints. The verifier does not need to be smarter across the board; it needs to be smarter in the specific domain where verification is required.
Actual superintelligence closes that door. If the system exceeds all human capability across all domains, there is no human-accessible verifier. The only thing that can check a superintelligence is a greater superintelligence, and the humans commissioning the check cannot verify that one either. The result is an infinite regress of unknowability.
This points toward a future that should give us pause: superintelligences of greater capability used to monitor superintelligences of lesser capability, on behalf of humans who are outclassed by both. The humans in this arrangement are not supervisors. They are clients, taking on faith that the greater intelligence is faithfully reporting on the lesser one. Trust without verification. The ELIZA problem, at civilisational scale.
We are not there yet. But the path from relative superintelligence to actual superintelligence is the same exponential curve that got us here.
The Flattening Curve
Yegge makes a prediction that maps precisely onto the framework developed in this series of papers. The intelligence curve, he argues, will appear to flatten for most users and most organisations, even as it continues to steepen behind locked doors.
Two forces produce this apparent flattening.
The first is restriction. Governments will lock down the most capable models the way they locked down enriched uranium. The chokepoint is the supply chain: hardware, compute, energy, the physical infrastructure that "The Mint" chapter of the Foundations paper describes. China will lock superintelligence inside its borders as hard as the United States will. The frontier continues to advance, but access to it narrows to a handful of state-adjacent actors.
The second is the discernment horizon itself. Even if you could access the frontier model, past some point you cannot tell what it is doing. The curve flattens for you because you have hit the ceiling of your own capacity to evaluate the output. A smarter model changes nothing you can measure, because measurement requires comprehension, and comprehension is what you have run out of.
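The two forces can be sketched as a pair of ceilings on what a user is able to observe. This is an illustrative model rather than anything Yegge specifies: it assumes the frontier, the access ceiling imposed by restriction, and the discernment horizon all sit on one hypothetical numeric scale, and that the perceived curve is simply the lowest of the three.

```python
def perceived_capability(
    frontier: float, access_ceiling: float, discernment_horizon: float
) -> float:
    """Capability a user can actually observe, capped by both ceilings (assumed model)."""
    return min(frontier, access_ceiling, discernment_horizon)


# The frontier keeps rising, but this user's ceilings stay fixed.
frontier_levels = [float(level) for level in range(1, 11)]
curve = [
    perceived_capability(f, access_ceiling=8.0, discernment_horizon=5.0)
    for f in frontier_levels
]
print(curve)  # [1.0, 2.0, 3.0, 4.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0]
```

In this sketch the frontier doubles after the user's curve goes flat, and none of that gain is visible from where the user stands.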
The result is that the post-Watershed economy will operate not at the frontier of capability but at the frontier of verifiability. Call it the triumph of the good enough.
This is the same conclusion reached by three different paths:
"A Measure of Safety" reaches it through the bell curve of harness safety: the upper edge of the "good enough" zone is where the economics work.
"How Much Does a Token Cost?" reaches it through coordination costs: the total cost of deploying "too good" tokens exceeds the alpha they generate.
Yegge reaches it through practitioner experience: you want the safer model, not the smarter one, because the smarter one produces work you cannot grade.
Three independent lines of argument converging on the same boundary. That boundary is real.
Companies Have Horizons Too
Yegge extends the discernment horizon from individuals to organisations, and this extension matters for post-Watershed economics.
For plenty of companies, Fable-class models are already past the demand horizon. Every problem they have, the model handles, and a smarter model would change nothing they could measure. For harder shops, the binding limit is discernment: the AI produces work that nobody in the organisation can grade.
Both of these are stable states. The company past its demand horizon has no incentive to pay more for frontier tokens. The company past its discernment horizon has every incentive not to, because unverifiable output is worse than merely expensive. It is untrustable. And an organisation that cannot trust its own cognitive infrastructure is an organisation that cannot function.
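As a rough sketch, the two organisational states can be told apart by asking which horizon a model crosses first. The scale, the names, and the tie-breaking choice below are all assumptions made for illustration.

```python
from enum import Enum


class BindingLimit(Enum):
    NONE = "neither horizon reached; a smarter model would still help"
    DEMAND = "demand horizon"
    DISCERNMENT = "discernment horizon"


def binding_limit(
    model_capability: float, demand_horizon: float, discernment_horizon: float
) -> BindingLimit:
    """Classify which horizon, if any, limits an organisation (hypothetical scale)."""
    if model_capability < min(demand_horizon, discernment_horizon):
        return BindingLimit.NONE
    # Assumption: when the two horizons tie, report the benign one (demand).
    if demand_horizon <= discernment_horizon:
        return BindingLimit.DEMAND
    return BindingLimit.DISCERNMENT


# An organisation with easy problems, and a harder shop that cannot grade its own output.
print(binding_limit(9.0, demand_horizon=4.0, discernment_horizon=7.0).name)   # DEMAND
print(binding_limit(9.0, demand_horizon=12.0, discernment_horizon=7.0).name)  # DISCERNMENT
```

In both bound cases a more capable model changes nothing the organisation can use, which is why each state is stable.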
This maps directly onto Gresham's Law as developed in this series. "Good enough" tokens drive out great tokens not just because they are cheaper, but because they are verifiable. The "great" token that exceeds the organisation's discernment horizon is not great at all. It is a liability. The rational actor chooses the token they can verify, the same way the rational actor chooses the token they can afford. In many cases, these are the same token.
The post-Watershed economy will stratify along the discernment horizon. Organisations with deep domain expertise, capable of verifying sophisticated output, will be able to extract alpha from more capable tokens. Organisations without that expertise will hit their discernment ceiling and stop. The ability to evaluate intelligence becomes as important as the ability to purchase it.
This is a new form of non-mintable alpha. You cannot buy discernment. You cannot mint it. It is accumulated through years of domain expertise, institutional knowledge, and hard-won judgment. The organisation that has it can safely use more capable tokens than the organisation that does not, extracting more alpha at every level of the capability curve.
Discernment is infrastructure.
Acknowledgements
This paper draws on Steve Yegge's "The Flat Curve Society" for the concepts of the demand horizon and the discernment horizon. It emerged from deep discussions with John Allsopp, with keen criticism from Tony Parisi, and was drafted by Claude Cowork from my extensive notes. I remain responsible for any errors that may have crept in.