Muon Is Scalable for LLM Training

Muon Is Scalable for LLM Training (github.com)
5 points by renonce 476 days ago

1 comment

For people who want to know more about the Muon optimizer: https://kellerjordan.github.io/posts/muon/