| The next major leap in LLMs (in the next year) is probably going to be prompt context size. Right now we have 2k, 4k, 8k ... and OpenAI also has a 32k model that they're unfortunately not really giving access to. The 8k model is nice, but it's GPT-4, so it's slow. I think the thing you're missing is that zero-shot learning is VERY hard, but anything beyond GPT-3 is actually pretty good once you give it some real-world examples. I think prompt engineering is going to be here for a while just because, on a lot of tasks, examples are needed. That doesn't mean it needs to be a herculean effort, of course, just that you need to come up with some concrete examples. This is going to be ESPECIALLY true with open source LLMs that aren't anywhere near as sophisticated as GPT-4. In fact, I think there's a huge opportunity to use GPT-4 to train the prompts of smaller models, come up with more examples, and help improve their precision/recall without massive prompt engineering efforts. |
Saw this article today about a different approach that opens up orders-of-magnitude larger contexts:
https://hazyresearch.stanford.edu/blog/2023-03-07-hyena