Hacker News new | ask | show | jobs
by pumanoir 1121 days ago
Optimization by gradient descent is used to do the learning in deep learning. For example, diff eqs are used to create optimizers that improve upon the classic 'adam' say, such as the new 'sophia' [1]. 1. https://arxiv.org/abs/2305.14342