by the8472 2339 days ago
> Literally billions of dollars have been invested in building systems like GPT-2, and megawatts of energy (perhaps more) have gone into testing them

Huh, seems like the bot that produced the article lacks some understanding about the real world. Maybe it just needs more training until it learns to associate megawatts with power instead of energy.
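To spell out the units point: a megawatt is a rate (power), and energy is that rate multiplied by time, so the quantity the article probably wanted is megawatt-hours. A minimal sketch of the arithmetic follows; the function name and the 1 MW, one-week figures are made up purely for illustration and say nothing about what GPT-2 actually consumed.

```python
def energy_mwh(power_mw: float, hours: float) -> float:
    """Energy in MWh used by a load drawing constant power (MW) for a number of hours."""
    return power_mw * hours

# Hypothetical numbers, for illustration only: a 1 MW cluster running for one week.
week_hours = 7 * 24
print(energy_mwh(1.0, week_hours))  # 168.0 (MWh)
```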

Meanwhile GPT2 completes this sentence to:

> Literally billions of dollars have been invested in building systems like GPT-2, and megawatts of power generation to support this project.

On a more serious note, GPT2 doesn't learn: it can't iteratively explore the world, doesn't experience time, and doesn't associate those words with other stimuli or anything like that. Given these and other limitations, what it does is fairly impressive. It's like a child reading advanced physics books without the necessary prior knowledge. Being able to form a coherent-seeming sentence of jargon is all you can expect from it. Of course the path to AGI is long.


GPT2 does learn, right.

I wonder how much of our knowledge of math is self-attention and how much is something else.

For example, most of what I do when I do calculus is self-attention. When I solve a calculus problem, I generally don't think through the squeeze theorem; I just apply cookbook math.

My current model for the brain is consciously driven self attention. Ie, 80-90% of what we do is just self attention and our conscious brain checks to see how right/interesting it is around 10-20% of the time.

The key therefore really is training your brain on the right data.

This model I find explains quite a lot of things about people and the way they behave / succeed.

> GPT2 does learn, right.

I meant the usual restriction of current DL models, where training and inference are separate. Humans update constantly. Think of code review: you have a model in your head of what the code you have written does. A reviewer spots some mistake, which means your model was incorrect. You adjust and, while you're at it, fix the same kind of mistake in several other places too. GPT2 would be none the wiser. At best the human could prompt it for its top list instead of the most likely completion and see if it comes up with something more useful, but again, it wouldn't update its weights.
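A toy sketch of what I mean, not GPT2's actual code: the "model" here is just a hypothetical table of fixed next-token scores (`LOGITS`, with made-up tokens and values), and `softmax` and `top_k` are illustrative helpers. Asking for the top list instead of the single most likely completion is still pure inference, so the weights come out exactly as they went in.

```python
import math

# Hypothetical frozen "weights": fixed next-token scores (logits) for one prompt.
LOGITS = {"power": 2.0, "energy": 1.5, "compute": 0.5, "bananas": -3.0}

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(score - m) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

def top_k(logits, k):
    """Return the k most probable tokens with their probabilities, best first."""
    probs = softmax(logits)
    return sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]

before = dict(LOGITS)
best = top_k(LOGITS, 1)        # the single most likely completion
candidates = top_k(LOGITS, 3)  # the "top list" a human could pick from

# Inference only reads the weights; nothing learned from the reviewer's feedback.
assert LOGITS == before
```

Updating on the reviewer's feedback would need a separate training step (a loss and a weight update), which is exactly the part that doesn't happen while the model is being used.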

And a human can also figure out by how much we need to update: a low probability event means not much adjustment is needed, while a serious error needs bigger adjustments.

> My current model for the brain is consciously driven self attention. Ie, 80-90% of what we do is just self attention and our conscious brain checks to see how right/interesting it is around 10-20% of the time.

Well, sure, the brain has lots of low-level automation. But the devil is in those "consciously driven" details.

The thing that GPT2 doesn't have is some kind of iterative cognitive model, where text is continually modified and re-examined. It also doesn't have any integration with memory, either long term or short term.
That doesn't seem particularly hard to add.

I agree the conscious AGI stuff is the tricky part. But then, maybe it's not. Maybe it's not as clever as we think it is, and if you have a good enough self-attention model the AGI just needs to be symbolic logic.

I'm thinking of something that'd pass a Turing test, btw. Not something that's hyper smart.