by anemll
82 days ago
Thanks for posting this, that's how I first found out about Dan's experiment!
SSD speed doubled in the M5 Pro/Max generation, which makes it usable!
I think one paper that's flying under the radar is "KV Prediction for Improved Time to First Token" (https://arxiv.org/abs/2410.08391), which will hopefully help with prefill for Flash streaming.