by epolanski 1 hour ago
Chinese papers and techniques have been very influential, and US labs have copied them.

Multi-head Latent Attention (MLA), multi-token prediction, and MoE architectures are some of the most famous examples.