If you thought we might be able to cure cancer in 2200, then I think you ought to expect there’s a good chance we can do it within years of the advent of AI systems that can do the research work humans can do.
The single most important thing we can do is to pause when the next model we train would be powerful enough to obsolete humans entirely. If it were up to me, I would slow down AI development starting now — and then later slow down even more.
If we’ve decided we’re collectively fine with unleashing millions of spam bots, then the least we can do is actually study what they can – and can’t – do.
Many fellow alignment researchers may be operating under radically different assumptions from you.
Perfect alignment just means that AI systems won’t want to deliberately disregard their designers’ intent; it’s not enough to ensure AI is good for the world.