by janalsncm
507 days ago
Why would you? Implementing optimizers isn’t something that MLEs do. Even the DeepSeek team just uses AdamW. An MLE should be able to look up and understand the differences between optimizers, but memorizing that information is extremely low priority compared with the other things they might be asked about.