by esc_colon_q
2125 days ago
It is very true that GPT-XYZ is fundamentally limited, like all transformer models, but don't misattribute the root cause to the fact that it's just processing text. The real limitation is that these are feedforward networks that just do perception, without any further processing of what they've perceived. You can hide that fact for a while by increasing the depth of the perception network, basically hard-coding some processing into the single pass, but you're still not capturing any of the "absolutely requires self-feedback" behavior that we care about: the stuff that takes a human more than a split second to do (aka almost all actual thought). A statistical model that made good use of what was coming in could absolutely learn (to take your example) math at every level based on nothing but text. Transformers are just nowhere close to having that capability because of their design limitations.
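To make the fixed-depth point concrete, here is a toy sketch. It is only an analogy I'm assuming for illustration, not a transformer: `refine` (one Newton step toward a square root) stands in for one layer, `fixed_depth` for a single feedforward pass, and `with_feedback` for a system that keeps feeding its own output back in. All three names are made up for this example.

```python
def refine(x, target):
    """One Newton step toward sqrt(target); stands in for a single 'layer'."""
    return 0.5 * (x + target / x)


def fixed_depth(target, depth):
    """Feedforward analogy: exactly `depth` steps, no matter how hard the input is."""
    x = 1.0
    for _ in range(depth):
        x = refine(x, target)
    return x


def with_feedback(target, tol=1e-9, max_steps=1000):
    """Feedback analogy: keep feeding the output back in until it stops changing."""
    x = 1.0
    for steps in range(1, max_steps + 1):
        nxt = refine(x, target)
        if abs(nxt - x) < tol:
            return nxt, steps
        x = nxt
    return x, max_steps


if __name__ == "__main__":
    # Easy input: three steps are already close to the answer.
    print(fixed_depth(4.0, 3))
    # Hard input: the same three steps are nowhere near sqrt(1e6) = 1000.
    print(fixed_depth(1e6, 3))
    # With feedback, the number of steps adapts to the input.
    print(with_feedback(1e6))
```

Adding depth only moves the cutoff: whatever `depth` you pick, some input needs more steps than that. The feedback version spends as many steps as the input demands.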