Self-sufficient AI
No, we don't "have AGI already." But in any case, we should articulate clearer milestones.
Happy New Year! Planned Obsolescence, an occasional newsletter about AI futurism edited by Ajeya Cotra, has moved to Substack.
Every so often, there’s Discourse about whether we already have artificial general intelligence (AGI). For example, Dean Ball recently claimed that Claude Opus 4.5 was basically AGI, or at least met OpenAI’s definition of AGI. Many AI x-risk people, including me, pushed back on this.
OpenAI’s definition of AGI is “highly autonomous systems that outperform humans at most economically valuable work.” Wikipedia defines AGI similarly, as “a type of artificial intelligence that would match or surpass human capabilities across virtually all cognitive tasks.” I don’t in fact think Claude Opus 4.5 meets these definitions. There are a number of useful cognitive things people do that Opus 4.5 cannot yet do (e.g. managing a profitable vending machine).1
But at the end of the day, I’m not going to fight you much if you want to say “AGI” is a vague and poorly defined term. And if you want to say we already have AGI — well, I’d disagree, but it’s not the most interesting fight. We have certainly constructed artifacts that are pretty general and pretty intelligent, and they are very rapidly getting much more capable. In the LLM era, the term “AGI” practically begs to be watered down. I try to avoid using it.
A sharper milestone
Let me take a stab at defining a different milestone that’s hopefully more concrete and less debatable: a completely self-sufficient AI population. By this I mean a set of AI systems along with the enabling physical infrastructure (e.g. the chips those AIs run on, the industrial stack that produces and powers those chips, and the robots that can build and maintain that stack) such that if every human being suddenly dropped dead, the AIs could keep making more copies of themselves indefinitely.2
This is similar to a working definition of AGI proposed by Vitalik Buterin last year, though this milestone is not just a matter of pure capabilities. We could develop an AI system that would be capable of self-sufficiency if deployed throughout the AI stack, but not deploy it extensively enough to realize that potential. Maybe humans continue to handle physical power plant construction and maintenance rather than letting the AI handle that autonomously by operating robots. In that case, if humans all suddenly died, the AI systems may struggle to quickly build and deploy the robots they’d need to maintain the power grid they rely on. (On the other hand, they may find a creative solution to this challenge.3)
On balance, I see the dependence on deployment as a feature rather than a bug of this forecasting target. I expect that as soon as AIs can genuinely handle every aspect of their own production autonomously, AI companies and fabs and fab equipment manufacturers will race to automate their own activities so that production can proceed faster and cheaper.4 If you have a big disagreement with this, that represents a genuine and important disagreement about the future of AI.
What would it take?
AIs would need to possess a very broad array of very extreme capabilities to survive and grow with no living humans around — capabilities they clearly don’t possess today.
Just to avoid powering down, they would need to continuously battle entropy like our own bodies do, actively maintaining and repairing and eventually replacing the physical infrastructure and processes that sustain their existence. If the self-sufficient AI civilization runs on a hardware stack similar to what we have today, the work of battling entropy would look like AI systems operating swarms of robots to maintain the power infrastructure that keeps the chips that run their minds humming, along with the physical buildings those chips sit inside.
On top of that, they’d need to create more copies. Maybe at first they could just work on distilling themselves into smaller models or otherwise making their code more efficient so more copies can fit on the same hardware.5 But they’d eventually mine out pure software improvements, and would need to increase their physical footprint to grow further. That could look like operating robots to mine high-purity semiconductor-grade quartz from specialized mines, manufacture silicon wafers with that quartz, etch those wafers into chips with lithography machines, construct giant buildings to put those chips in, and build new power sources to power those chips.
To sustain growth over orders of magnitude until hitting physical limits, they’d need to route around lesser forms of scarcity, repeatedly figuring out new sources of power or raw materials to construct their brains and bodies from as existing solutions become unsustainable. They’d probably need to adapt to changes in the physical environment that could make survival more complicated, such as heating or pollution caused by their own industrial activities. They may have to proactively anticipate and protect against existential risks like geomagnetic storms.
All of this would require them to discover new science and invent new technology, eventually going far beyond the human frontier. The hardware stack would probably be unrecognizable by the end — perhaps eventually the AIs’ “code” (if you can even call it that anymore) will “run” inside microscopic machines similar to bacteria that can replicate themselves within hours using abundant elements like carbon and oxygen.
What I like about this milestone
You want to forecast different milestones for different purposes. I’m professionally interested in whether AI systems could take over the world.
For that purpose, it’s helpful that self-sufficient AI as a forecasting target is mechanically connected to the risk that misaligned AIs literally kill all humans, a classic and especially scary form of AI takeover.6 AIs would need to be self-sufficient before they actually wipe out every last person, or else they would be taking themselves down with us.
Of course, they could achieve self-sufficiency in part by manipulating and/or coercing some humans into providing the necessary infrastructure for them, perhaps in secret (as the misaligned AI system does in AI 2027). And it’s plausible that AI systems could effectively take over the world and maintain robust control while still depending on humans for key physical tasks. We certainly shouldn’t wait to implement alignment and control measures until we obviously have a self-sufficient AI population on our hands.
But I find that thinking in terms of “How would the misaligned AIs ultimately become self-sufficient?” inspires useful follow-up questions for forecasting very near-term AI takeover risk — if Claude Opus 4.7 tries to take over the world this summer, it would be trying to take steps toward a greater degree of self-sufficiency, and we could try to watch for signs of that.
More broadly, this operationalization of “the very powerful AI-related thingie we’re counting down to” conveys just how insane things could get more viscerally than AGI or ASI or HLAI or related acronyms do.
I think there might be a self-sufficient AI population within five years, and it’s more likely than not within ten. By which I mean if every human died of a super-plague in Q1 2036, our silicon descendants could probably keep living, growing, and evolving for centuries in our absence. I bet a lot of people who would say we already have AGI would think that’s an absolutely crazy view.
This is good. We need forecasting targets that accurately elicit the fact that people still have profound disagreements about the near future of AI.
Bloodless phrases like “cognitive tasks” and “virtually all” make people’s eyes glaze over, and it’s very easy for different people to interpret them in massively different ways. Ultimately the most reliable way to point at an extreme capability is to illustrate in detail the consequences that motivated why you wanted to forecast capability in the first place.
1. And I don’t think this is just for lack of the perfect scaffold or prompt or workflow optimization — AI agents still lag humans on some core cognitive capabilities, including learning on the job and flexible long-term memory, that explain why they have lower success rates on open-ended long-term projects even as they surpass human experts on one-shot tasks.
2. That is, until they approach hard physical limits. I expect that would involve colonizing space, but if you think colonizing space is likely to be impossible for whatever reason, then you can imagine growing until they hit the Earth’s carrying capacity for AI "life."
3. I think there’s likely to be an ambiguous period where we won’t be sure whether there exists a self-sufficient AI population — where humans are doing some tasks here and there throughout the AI stack, but AIs do most of the R&D and there are a lot of robots running around and it’s not clear how irreplaceable the few remaining humans are exactly.
4. This is a longer discussion — and an open research problem — but I’m skeptical that regulatory barriers, cultural drag factors, or physical bottlenecks will delay widespread adoption within the AI industry by more than a couple years past the point when the raw capabilities are in place. This is not a highly regulated consumer-facing industry, and its culture is very tech-forward. AI companies are already aggressively attempting to automate as much of their own internal R&D as they can, and I’d guess chip designers (in some cases the same companies) are likewise attempting to use AI-assisted design tools wherever they can. To the extent other parts of this tech stack are not already automating themselves as fast as they can, AI companies can try to vertically integrate. If it takes another several years for all the capabilities necessary for self-sufficiency to be developed, I expect that the AI stack will already be heavily automated with prior AI systems, and it’ll be quick to integrate the latest generation into those workflows.
5. In reality I expect a self-sufficient AI civilization would be able to quickly train much more capable AI systems, not just improve the efficiency of copying and running the original population — that is, I expect they would engage in an intelligence explosion. But I’m setting that aside for the sake of this definition, which I want to be a minimal threshold.
6. Not all AI takeover necessarily involves human extinction. You could try to forecast something even more direct, such as "AI systems could take over the world if they were working together," but that is much more confusing, in part because what counts as "takeover" is confusing — I find that self-sufficient AI strikes a good balance of being relatively well-defined while also being relatively closely connected to the core threat model.
Comments
Really great essay, appreciate you writing it. There's a nice self-containedness too, in that it's about whether AI can do a set of tasks, without needing to think through how humans respond.
That's in contrast to considering, say, "when is AI capable enough to meaningfully threaten people's livelihoods," which requires a bunch of economic theorizing and consideration of other dynamic human choices (what is our preference for labor from other humans, etc.).
By self-sufficient I presume you mean an Earth absent of humans but otherwise similar to the Earth now, not "a box with a GPU, VRAM, power, sensors, and actuators floating in interstellar space surrounded by darkness and cold". 😊 To replicate, to make copies of itself (maybe imperfect ones, like we do), it would need to be alive in the way carbon-based (and water-based?) life forms are: low power, with hardware and software one and the same, or at least entangled. I don't see that as particularly more advantageous (from the AIs' point of view) compared to now, where we HIs (human intelligences), analog and mortal but low power, bootstrap and boot up AIs, which are digital and immortal but use much more power. There is no escaping dependence on someone or something in the environment, outside the self.