Hacker News
by esc_colon_q 2125 days ago
It is very true that GPT-XYZ is fundamentally limited, like all transformer models, but don't misattribute the root cause to the fact that it's just processing text.

The real limitation is that these are feedforward networks that just do perception, without any processing of what they've perceived. You can hide that fact for a while by increasing the depth of the perception network, basically hard-coding some processing into the single pass, but you're still not capturing any of the behavior we care about that absolutely requires self-feedback: anything that takes a human more than a split second to do (aka almost all actual thought).
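A toy analogy for the depth-versus-feedback point, not a model of any actual transformer: treat one Euclid reduction as a single unit of "processing". The names `step`, `fixed_depth`, and `with_feedback` are made up for this sketch. A fixed number of steps handles easy inputs but stalls on inputs that need more steps than were baked in, while a loop that feeds its own output back in just keeps going until it's done.

```python
def step(state):
    """One unit of 'processing': a single Euclid reduction on the pair (a, b)."""
    a, b = state
    return (b, a % b) if b != 0 else (a, b)


def fixed_depth(a, b, depth):
    """Single pass with a hard-coded number of steps, like a fixed stack of layers."""
    state = (a, b)
    for _ in range(depth):
        state = step(state)
    return state


def with_feedback(a, b):
    """Feed the output back in until nothing changes, however many steps that takes."""
    state = (a, b)
    while True:
        nxt = step(state)
        if nxt == state:
            return state
        state = nxt


# Easy input: three steps are enough, so depth 3 reaches the answer (gcd in slot 0).
print(fixed_depth(48, 18, depth=3))
# Harder input (consecutive Fibonacci numbers): depth 3 stops partway through.
print(fixed_depth(89, 55, depth=3))
# The feedback loop finishes both, because its step count depends on the input.
print(with_feedback(89, 55))
```

Making `depth` bigger only moves the cutoff; there is always some input that needs one more step than you hard-coded.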

A statistical model that made good use of what was coming in could absolutely learn (to take your example) math at every level based on nothing but text. Transformers are just nowhere close to having that capability because of their design limitations.

1 comment

But the fact that it 'lives' in a world of text is much more fundamental. There is simply not enough information in text to draw a working model of the world. We presuppose such a model in our words and in our communication, so fundamentally any algorithm that is solely trained on text can't learn a model of the world from it.