Discussion about this post

Ashwin

Great post, ty!

The Pokemon prediction seems harder to me than the other 80% predictions, though maybe that's just because I saw an early Claude Plays Pokemon and was surprised by how many basic things tripped it up. Something something "real world complexities are surprisingly tricky to address"? I think recent models do substantially better, but still get tripped up in silly ways. Maybe this gets solved by continuous progress in a few areas of weakness, though: integrated image processing + longer context windows + better "notes-to-self" writing.

Math feels like the cleanest task, in the sense that there are no surprising / spiky environmental features to process; it's "natively" a thing you can do via text stream.

VN design also feels intuitively easy to me; maybe the main need here is again just bigger context windows and better self-management. I should play with Claude Code and see what trips it up here!

How do you plan to handle the incoming era of amazing video games on tap? Plug your ears with wax, or try to harness it for The Good by generating gripping edutainment for yourself about AI progress?

davik

Although it is somewhat against the spirit of empirical laws, whose effect comes from aggregating a large number of independent causes, I would like to ask people here to speculate on what sorts of human tasks have a >24h time horizon but cannot easily be delegated to a group of humans who each have a 24h window plus notes from a manager (a setup that could be replicated with 24h agents).

I assume such tasks must somehow involve learning + memory on task-specific subtasks, of a form that is not easily learned or transferred from an instruction manual plus a short period of practice. For example, a physical task may require you to become proficient with a new type of machinery, and that proficiency cannot easily be attained through short-time-scale practice. But for cognitive tasks, I find it harder to see what type of task requires experience-based learning. Perhaps it is one with a subtask that involves learning a novel subject, a novel programming language, etc.
