Hacker News
Simple, zero overhead way to compress model, KV cache via Low-Rank Decomposition (jeffreywong20.github.io)
1 point by thw20 37 days ago