by esafak 509 days ago
I'd still expect an MLE to know it though.

Why would you? Implementing optimizers isn’t something that MLEs do. Even the DeepSeek team just uses AdamW.

An MLE should be able to look up and understand the differences between optimizers, but memorizing that information is extremely low priority compared with the other things they might be asked about.