by andrew-wja
2741 days ago
The speedup is for the whole network, as the graph labels show! The point of the compiler is that you produce code that implements the entire forward pass, so you can deploy that code wherever you need to do the inference. I agree we need AVX-512 -- hopefully I can get access to a Skylake-X machine in the next few days.