by seanhunter 1235 days ago
I understand that, but that is an extremely banal observation if you think about it, because the fact that there is this incredible emergent behavior from a simple starting system is the heart of the mystery here.

One of the things that everyone is sort of skipping over is the "sufficient training" part. There is no bootstrap reinforcement learning possible for AGI. You can't AlphaGo this sucker and have it play simulations against itself, because the whole idea of generality is that there isn't a simple rule framework in which you could run such a simulation. So training any kind of AGI is a really hard problem.

2 comments

He's specifically answering the question of why he thinks he has any chance of success doing this independently when there are giant organizations funding this.
There are ways that LLMs can self-improve, such as in this paper: https://arxiv.org/abs/2210.11610

I would speculate that there are more ways to train on logical consistency of the output, and improve the models further.
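
To make the "train on consistency" idea a bit more concrete, here is a rough sketch of one possible version: sample several reasoning paths per question, take a majority vote on the final answer, and keep only the paths that agree with the winner as fine-tuning data. The function name, the `min_agreement` threshold, and the sample data are all made up for illustration, and the actual sampling and fine-tuning steps are left out. It is not a reproduction of the linked paper's method.

```python
from collections import Counter


def select_self_consistent(samples, min_agreement=0.6):
    """Filter model outputs by majority vote (hypothetical helper).

    samples: dict mapping a question to a list of (reasoning, answer) pairs,
             e.g. several sampled completions for the same prompt.
    Returns a list of (question, reasoning, answer) tuples whose answer
    matches the majority answer, for questions with enough agreement.
    """
    kept = []
    for question, paths in samples.items():
        if not paths:
            continue
        # Count how often each final answer appears across the sampled paths.
        counts = Counter(answer for _, answer in paths)
        top_answer, votes = counts.most_common(1)[0]
        # Only trust the majority answer if enough of the samples agree on it.
        if votes / len(paths) >= min_agreement:
            kept.extend((question, r, a) for r, a in paths if a == top_answer)
    return kept


# Made-up sampled outputs standing in for real model generations.
samples = {
    "What is 17 + 25?": [
        ("17 + 25 = 42", "42"),
        ("17 + 20 = 37, then + 5 = 42", "42"),
        ("17 + 25 = 43", "43"),
    ],
    "Is 91 prime?": [
        ("No small factors come to mind", "yes"),
        ("7 * 13 = 91", "no"),
    ],
}

training_set = select_self_consistent(samples)
for question, reasoning, answer in training_set:
    print(question, "|", reasoning, "|", answer)
```

The obvious catch is that agreement is not correctness: if the model is confidently wrong in the same way every time, this filter reinforces the mistake, which is part of why the threshold is a judgment call.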