Most experts were surprised by progress in language models in 2022 and 2023. There may be more surprises ahead, so experts should register their forecasts now about 2024 and 2025.
Most new technologies don’t accelerate the pace of economic growth. But advanced AI might do this by massively increasing the research effort going into developing new technologies.
If you thought we might be able to cure cancer in 2200, then I think you ought to expect there’s a good chance we can do it within years of the advent of AI systems that can do the research work humans can do.
Once a lab trains AI that can fully replace its human employees, it will be able to multiply its workforce 100,000x. If these AIs do AI research, they could develop vastly superhuman systems in under a year.
Researchers could potentially design the next generation of ML models more quickly by delegating some work to existing models, creating a feedback loop of ever-accelerating progress.
The single most important thing we can do is to pause when the next model we train would be powerful enough to obsolete humans entirely. If it were up to me, I would slow down AI development starting now — and then later slow down even more.
If we’ve decided we’re collectively fine with unleashing millions of spam bots, then the least we can do is actually study what they can – and can’t – do.
Many fellow alignment researchers may be operating under radically different assumptions from you.
If we can accurately recognize good performance on alignment, we could elicit lots of useful alignment work from our models, even if they're playing the training game.
We're creating incentives for AI systems to make their behavior look as desirable as possible, while intentionally disregarding human intent when that conflicts with maximizing reward.