| This aligns well with my personal experience using GPT-4. The model gives surprisingly good responses on topics that I know are well covered online but where the exact information I want can be troublesome to find. I have even found it useful when I know there is a tool for what I want but can’t recall the jargon needed to find it via Google. Simply describing the rough idea is enough to get the model to spit out the jargon I need. However, the moment I ask a real question that goes beyond summarizing something covered thousands of times online, I am immediately let down. Is this just a result of the foundation of the model being the world’s best autocompletion engine? My assessment is “yes”, and I don’t believe that any of the coming modifications, like plugins, will fundamentally change this. |
Furthermore, I just don’t feel like the transformer architecture is suited to problem solving. I may just be a charlatan, but self-attention over the space of words does not seem like it’s going to be enough, and praying that problem solving falls out as emergent behavior if we just add more parameters is… unscientific-ish? Now, if you could figure out a way to do self-attention over the space of concepts? Then maybe you’ve got something.
I feel like AlphaGo-style ideas and some variation on MCTS (Monte Carlo tree search) are more likely to produce a solid problem-solving architecture.