by yorwba 165 days ago
https://arxiv.org/abs/2512.24880 was published less than two weeks ago, which should explain why it's not more common yet. And it's not that amazing either. It's a slight quality improvement for a slight increase in cost. It's not even clear to me whether it pays for itself.
1 comment

My bad, I mistook this for something related to Multi-head Latent Attention (MLA).