by amitport
45 days ago
Thanks for that!
Note that the residual chain is empirically and theoretically inferior to our unbiased scale; furthermore, it requires an additional bit in certain cases.
Additionally, TurboQuant was not the first to apply EDEN to KV-cache (see for example https://arxiv.org/abs/2411.17525 from 2024).