by jmward01
376 days ago
This is definitely the right problem to focus on. I think the answer is a different LLM structure that has unlimited context. The transformer block, trained with causal masks, got us here, but it is now limiting us in many massive ways.