|
|
|
|
|
by joshhart
905 days ago
|
|
If you are making many requests in batch this works ok because you can shuffle the next layer in while the current one is processing a set of matrix multiplies. This takes it from being a memory bound problem to a flops bound problem. This really only works if you care about throughput and not latency. |
|