|
|
|
|
|
by varunneal
243 days ago
|
|
Muon was invented by Keller Jordan (and then optimized by others) for the sake of this speedrunning competition. Even though it was invented less than a year ago, it has already been widely adopted as SOTA for model training |
|
Both share equal credit I feel (also, the paper's co-authors!), both put in a lot of hard work for it, though I tend to bring up Bernstein since he tends to be pretty quiet about it himself.
(Source: am experienced speedrunner who's been in these circles for a decent amount of time)