by londons_explore
2176 days ago
I experimented with networks where weights were removed if they did not contribute much to the final answer. My conclusion was that I could easily set >99% of the weights in my (fully connected) layers to zero with minimal performance impact after enough training. But the training time went up a lot (after removing a bunch of connections, you effectively have to do more training before removing more), and inference speed wasn't really improved because sparse matrices are sloooow. Overall, while it works out for biology, I don't think it will work for silicon.
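
For anyone who wants to play with the prune-then-retrain loop, here is a rough sketch of how it could look in NumPy. This is not my original code. It assumes plain magnitude pruning (drop the smallest absolute weights), which is only one possible stand-in for "does not contribute much", and the names `prune_step`, `iterative_prune` and the `retrain` callback are made up for illustration.

```python
import numpy as np

def prune_step(weights, mask, fraction):
    """Return a new mask with the smallest-magnitude `fraction` of live weights removed."""
    alive = np.flatnonzero(mask)              # flat indices of weights still in play
    n_drop = int(len(alive) * fraction)
    if n_drop == 0:
        return mask
    magnitudes = np.abs(weights.ravel()[alive])
    drop = alive[np.argsort(magnitudes)[:n_drop]]
    new_mask = mask.copy().ravel()
    new_mask[drop] = False
    return new_mask.reshape(mask.shape)

def iterative_prune(weights, rounds, fraction, retrain=None):
    """Alternate pruning and (optional) retraining for a fixed number of rounds."""
    mask = np.ones(weights.shape, dtype=bool)
    for _ in range(rounds):
        mask = prune_step(weights, mask, fraction)
        weights = weights * mask              # zero out the removed connections
        if retrain is not None:
            # Hypothetical hook: a real setup would run more training here,
            # then re-apply the mask so pruned weights stay at zero.
            weights = retrain(weights, mask) * mask
    return weights, mask

# Toy example: random weights standing in for one fully connected layer.
rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256))
pruned, mask = iterative_prune(w, rounds=7, fraction=0.5)
sparsity = 1.0 - mask.mean()
print(f"sparsity: {sparsity:.4f}")
```

Halving the live weights seven times leaves 512 of the 65,536 entries, so a bit over 99% of them are zero. Note that `pruned` is still a dense array of the same shape, so a normal matrix multiply does exactly the same amount of work as before. Getting any speedup means switching to a sparse format, and that is where the trouble starts.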