Hacker News new | ask | show | jobs
by hodgehog11 47 days ago
Couldn't have said it better, although this is only for grokking with the modular addition task on networks with suitable architectures. L1 regularization is absolutely a clear form of compression. The modular addition example is one of the best cases to see the phenomenon in action.