Hacker News
by az226 726 days ago
Odd that they don’t expand on this:

In Yandex’s pre-trainings, the implementation of YaFSDP along with other memory optimization strategies resulted in a speed gain of 45%.