What we're doing here

We’re trying to think ahead to a possible future in which AI is making all the most important decisions.

I’m Ajeya; I work at Open Philanthropy, a grantmaking organization that aims to do as much good as possible with its resources, where I fund research in AI alignment (and try to think through which research might be most applicable to future, more powerful AI systems). I’m the editor of Planned Obsolescence, and for now it’s me and Kelsey (Senior Writer at Vox’s Future Perfect) writing articles.[1]

With this blog, we’re trying to think ahead to a possible future in which AI is functionally making all the most important decisions in our economy and society.[2]

We think that within a couple of decades, we’re likely[3] to live in a world where most of the R&D behind consequential new innovations is conducted by AI systems. In that world, human CEOs have to rely on AI consultants and hire mostly AI employees for their company to have much chance of making money on the open market. Human military commanders have to defer to AI strategists and tacticians (and automate all their physical weapons with AI) for their country to stand much of a chance in a war. And human heads of state and policymakers and regulators have to lean on AI advisors to make sense of all this and craft policies that have much hope of responding intelligently (and have to use AI surveillance and AI policing to have a prayer of properly enforcing those policies).

We’ll refer to this future as “the obsolescence regime.” The obsolescence regime is a world where economic and military competition doesn’t operate on human timescales and isn’t constrained by human limitations — in this regime, a company or country that tries to make do with mere human creativity and understanding and reasoning would be outcompeted as surely as one that refuses to touch a computer would be today.

We’ve tried to think long and carefully about what the future might look like, but it’s hard to be really confident about much — the obsolescence regime may not come to pass anytime soon. But AI companies are spending billions of dollars working very hard to make AI systems better and better at understanding how the world works, coming up with creative solutions to difficult problems, making high-level decisions and anticipating their likely consequences, and interfacing with the real world through the internet. So far, they’ve had a startling degree of success — state-of-the-art AI systems have mastered benchmarks in math, science, coding, law, and more shortly after those benchmarks are introduced, and their commercial use seems to be expanding quickly. By default, Kelsey and I expect that AI companies will keep ramping up their efforts, and we think there’s a good chance they’ll keep seeing success. That means we could all be swept toward the obsolescence regime with disorienting speed.

This prospect is terrifying for many reasons.[4] One particular reason that we pay special attention to is the possibility of misaligned AI — the possibility that the AI systems effectively running our society may pursue their own goals, disconnected from what their human designers intended.

In the obsolescence regime, a huge volume of critical business and governance and military decisions would be made by AI advisors and employees without any human or group of humans having much understanding of why they were made, or playing much of a role at all other than rubber-stamping them. This means that if these ubiquitous AI advisors and employees happened to be pursuing their own misaligned ends rather than trying to carry out their designers’ instructions, it could be very easy for them to do things like lie to humans, siphon resources into their own control, cover up their tracks to evade monitoring and detection, and eventually seize control from humans in an open (and possibly violent) coup.

Would we actually end up training and deploying very powerful AI systems that were pursuing misaligned ends? Unfortunately, I think that if companies keep scaling up existing training techniques to more and more powerful AI systems, misalignment and AI takeover would result by default.

We’ll be talking a lot about the alignment problem and the possibility of AI takeover in this blog, because we think it’s one of the most important and most poorly understood challenges we’ll face as we head into the obsolescence regime. But more broadly, we don’t think the world is remotely ready for the rapid proliferation of very intelligent and creative AI systems. We could use decades or centuries of experimentation and reflection and preparation to ease into this kind of change, but Kelsey and I think we probably don’t have that kind of time. Fundamentally, this blog is going to be trying to grapple with that.

  1. The views expressed on this blog are our own, and don’t necessarily reflect Open Phil’s or Vox’s views. ↩︎

  2. Whether or not some of these decisions technically require a human’s stamp of approval according to the law or custom. ↩︎

  3. Specifically, these days I (Ajeya) think that there’s probably a 50% chance that this will happen by the year ~2040 (17 years from the time of writing). ↩︎

  4. For example: How are we supposed to deal with most humans becoming unemployable, and the possibility of extreme concentration of wealth in the hands of a few capital holders? Could authoritarian states use powerful AI systems to cement their rule through ubiquitous surveillance? What should we think about AI consciousness and the ethics of using AI systems for our own ends? ↩︎