by alexbeloi
2918 days ago
My naive take on this is that it makes biological sense to keep the net expected electrical impulses approximately the same before and after strengthening. At the end of the day, the brain has energy constraints. Something functionally similar can be (and is) done in ANNs, but it achieves a different end (avoiding over-fitting) and doesn't reduce energy (compute) expenditure, since activations and non-activations are both computed in expectation and take the same number of cycles in dense networks. I'd love to see more work on massive sparse networks, where you actually get compute efficiency if you can reduce the number of activations without hurting your optimization target.
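To make the dense vs. sparse point concrete, here is a rough toy sketch in plain Python. Everything in it is made up for illustration (the `forward` helper, the layer sizes, the weights); it just counts multiply-accumulates for one layer, once touching every weight and once skipping inputs that are exactly zero.

```python
def forward(weights, x, skip_zeros):
    """Toy layer: returns (outputs, number of multiply-accumulates).

    `weights` is a list of rows, one row per input unit (hypothetical layout).
    """
    n_out = len(weights[0])
    out = [0.0] * n_out
    macs = 0
    for i, xi in enumerate(x):
        # A "sparse" evaluation skips inactive inputs entirely.
        if skip_zeros and xi == 0.0:
            continue
        for j in range(n_out):
            out[j] += xi * weights[i][j]
            macs += 1
    return out, macs


# 8 inputs, 4 outputs, arbitrary illustrative weights.
weights = [[0.1 * (i + j) for j in range(4)] for i in range(8)]
# Only 2 of the 8 input units are active.
x = [0.0, 1.3, 0.0, 0.0, 0.7, 0.0, 0.0, 0.0]

dense_out, dense_macs = forward(weights, x, skip_zeros=False)
sparse_out, sparse_macs = forward(weights, x, skip_zeros=True)

print(dense_macs)               # 32
print(sparse_macs)              # 8
print(dense_out == sparse_out)  # True
```

Same output either way, but the dense pass pays for all 32 multiply-accumulates no matter how many units are silent, while the skipping pass only pays for the 2 active ones. Real hardware complicates this a lot (dense matmuls are heavily optimized, irregular sparsity is not), so treat it as an operation count, not a speed claim.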