by concurrentsquar
815 days ago
You don't; you have to use real-valued inertial 'latent weights' during training: https://arxiv.org/abs/1906.02107

There is still a reduction in memory usage, though (just not 24x):

> "Furthermore, Bop reduces the memory requirements during training: it requires only one real-valued variable per weight, while the latent-variable approach with Momentum and Adam require two and three respectively."
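
To make the "one real-valued variable per weight" point concrete, here is a rough sketch of a Bop-style update as I understand it from the linked paper. The function name `bop_step` and the parameter names `gamma` (adaptivity rate) and `tau` (flip threshold) are my own labels, not an official API, and the values in the usage line are arbitrary rather than recommended settings. The only real-valued state kept per weight is the gradient moving average `m`; the weight itself stays at +1 or -1.

```python
def bop_step(weights, momenta, grads, gamma, tau):
    """One Bop-style update over flat lists (illustrative sketch, not a reference implementation).

    weights: binary weights, each +1 or -1
    momenta: one real-valued moving average of the gradient per weight
    grads:   current gradients
    """
    new_weights, new_momenta = [], []
    for w, m, g in zip(weights, momenta, grads):
        # Exponential moving average of the gradient: the only real-valued state.
        m = (1.0 - gamma) * m + gamma * g
        # Flip the weight when the averaged gradient is strong enough
        # and has the same sign as the weight.
        if abs(m) > tau and (m > 0) == (w > 0):
            w = -w
        new_weights.append(w)
        new_momenta.append(m)
    return new_weights, new_momenta


# Arbitrary example values, chosen only to show one flip and two non-flips.
w, m = bop_step([1, -1, 1], [0.0, 0.0, 0.0], [1.0, 1.0, 0.0], gamma=0.5, tau=0.1)
```

By contrast, a latent-weight setup keeps a real-valued weight plus the optimizer's own state (one extra buffer for Momentum, two for Adam), which is where the two and three in the quote come from.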