
In unstructured pruning, individual weights of a weight tensor are pruned with no constraints on their position in the tensor. In structured pruning, there are constraints on which weights are pruned. Think of pruning entire output channels of a convolutional layer vs. pruning arbitrary weights of the same tensor. Unstructured pruning allows a greater percentage of the weights to be pruned, but it generally doesn't speed up a network on hardware accelerators: the zeros end up scattered through a tensor that keeps its original shape, and accelerators favor dense matrix multiplications.
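To make the difference concrete, here is a minimal sketch in plain Python. It assumes a layer's weights are flattened to a list of rows, one row per output channel, and it uses magnitude (for individual weights) and L1 norm (for channels) as the pruning criteria. Those criteria and the function names `unstructured_prune` and `structured_prune` are illustrative choices, not something from a particular library.

```python
import random


def unstructured_prune(weights, sparsity):
    """Zero the smallest-magnitude weights, wherever they sit in the matrix."""
    magnitudes = sorted(abs(w) for row in weights for w in row)
    k = int(len(magnitudes) * sparsity)  # number of weights to zero
    if k == 0:
        return [row[:] for row in weights]
    threshold = magnitudes[k - 1]
    # Ties at the threshold are zeroed too, so slightly more than k may go.
    return [[0.0 if abs(w) <= threshold else w for w in row] for row in weights]


def structured_prune(weights, sparsity):
    """Remove whole output channels (rows) with the smallest L1 norm."""
    norms = [sum(abs(w) for w in row) for row in weights]
    k = int(len(weights) * sparsity)  # number of channels to remove
    drop = set(sorted(range(len(weights)), key=norms.__getitem__)[:k])
    return [row[:] for i, row in enumerate(weights) if i not in drop]


random.seed(0)
# Hypothetical layer: 4 output channels, 8 weights per channel.
W = [[random.gauss(0, 1) for _ in range(8)] for _ in range(4)]

U = unstructured_prune(W, 0.5)
S = structured_prune(W, 0.5)

print(len(U), len(U[0]))  # same shape as W, with zeros scattered through it
print(len(S), len(S[0]))  # fewer rows: a smaller matrix that is still dense
```

The unstructured result has the same shape as the input, so a dense matmul does the same amount of work unless the kernel is sparsity-aware. The structured result is simply a smaller dense matrix, which is why it maps more directly onto accelerator-friendly operations.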