by locuscoeruleus
1441 days ago
Adam was very effective when it was introduced, so it was widely adopted. Since then, only models that work well with Adam have made it from the idea stage to actually working. I think there's reason to believe we have overfit our model architectures to our loss functions and optimizers.