by amrrs
505 days ago
This has been the problem with a lot of long-context use cases. It's not just whether the model supports it, but also whether you have sufficient compute and can afford the inference time. This is exactly why I was excited about Mamba, and now possibly Lightning Attention. That said, the new DCA, on which these models base their long-context support, could also be an interesting area to watch.