by turkeygizzard
1268 days ago
I'm pretty sure the GPT model is huge and wouldn't fit on any conventional GPU. Even if they open-sourced the weights, I don't think most people would be running it at home. As for the text limits, AFAIK there's an inherent limit in the architecture: Transformers are trained on finite-length sequences (I think their latest uses 4096 tokens). I've been trying to understand how ChatGPT seems to manage context/understanding beyond this window length.
(Specifically, AI Dungeon type games where ChatGPT is the DM and the human the protagonist, or vice versa. The most common failure mode seems to be that it forgets whether it's playing the DM or the protagonist. To be fair, it performs admirably well despite the limitations.)
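
To make that failure mode concrete, here's a toy sketch of one way a fixed window could produce it. This is a guess at the general mechanism, not a description of what ChatGPT actually does. Everything here is hypothetical: `count_tokens` is a crude word-count stand-in for a real tokenizer, `fit_to_window` and `fit_with_pinned` are names I made up, and the 16-token window is tiny so the effect shows up in a few turns.

```python
def count_tokens(text):
    # Assumption: one token per whitespace-separated word.
    # A real tokenizer would give different (usually larger) counts.
    return len(text.split())


def fit_to_window(messages, max_tokens):
    """Keep the most recent messages whose combined token count fits in max_tokens."""
    kept = []
    used = 0
    # Walk backwards from the newest message and stop at the first one that does not fit.
    for msg in reversed(messages):
        cost = count_tokens(msg)
        if used + cost > max_tokens:
            break
        kept.append(msg)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


def fit_with_pinned(pinned, messages, max_tokens):
    """Always keep the pinned message, then fill the remaining budget with recent messages."""
    budget = max_tokens - count_tokens(pinned)
    return [pinned] + fit_to_window(messages, budget)


# Hypothetical role-play transcript: the first message sets up who plays what.
setup = "You are the DM and I am the protagonist"
turns = [
    "I open the door",
    "The door creaks open",
    "I draw my sword",
    "A goblin leaps out",
]

# Naive sliding window: the role setup is the oldest message, so it is the first to go.
naive = fit_to_window([setup] + turns, max_tokens=16)
print(setup in naive)

# Pinning the setup keeps the roles in view, at the cost of less room for recent turns.
pinned = fit_with_pinned(setup, turns, max_tokens=16)
print(pinned)
```

With the naive window, the model would only see the four most recent turns and nothing that says who is the DM, which would at least be consistent with the role confusion. Pinning the setup avoids that here but leaves room for just one recent turn, so it trades one kind of forgetting for another.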