The typical neural net matrix multiplication is an (N_EXAMPLES x N_FEATURES_IN) matrix multiplied by an (N_FEATURES_IN x N_FEATURES_OUT) matrix.

The output feature count is completely independent of the data size, and the input feature count depends only on the dimensionality of the data (not the number of points), and even that holds only in the first layer of the network. Even for datasets with a huge number of examples, the net usually trains on only a small "minibatch" of examples at a time, typically somewhere between 16 and 1024. This minibatch size is the algorithmic N_EXAMPLES. Given these numbers, the typical neural net matrix multiplication is probably something like (32, 256) x (256, 128). This is not nearly large enough for non-N^3 matmul algorithms to accelerate things.