by microtonal
2122 days ago
Found the discrepancy: I use single precision in PyTorch, and when I benchmark sgemm, the SSE code path is selected. Conclusion: MKL now detects Zen, but it currently implements a Zen code path only for dgemm, not for sgemm. To get good sgemm performance, you have to fake being an Intel CPU. Edit: a longer description is at https://github.com/pytorch/builder/issues/504
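For anyone who wants to check this on their own machine, here is a rough sketch of the kind of comparison I mean. It is not the exact benchmark I ran: the helper names (`gemm_gflops`, `bench_matmul`, `main`), the matrix size, and the repeat count are all made up for illustration, and it assumes a PyTorch build that is linked against MKL.

```python
import time


def gemm_gflops(n: int, seconds: float) -> float:
    """GFLOP/s for one n x n by n x n matrix multiply (about 2*n^3 flops)."""
    return 2.0 * n**3 / seconds / 1e9


def bench_matmul(dtype_name: str, n: int = 2048, repeats: int = 10) -> float:
    """Time torch.matmul for the given dtype and return the best GFLOP/s."""
    import torch  # imported lazily so gemm_gflops works without PyTorch

    dtype = getattr(torch, dtype_name)  # e.g. torch.float32 or torch.float64
    a = torch.rand(n, n, dtype=dtype)
    b = torch.rand(n, n, dtype=dtype)
    torch.matmul(a, b)  # warm-up run, not timed

    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        torch.matmul(a, b)
        best = min(best, time.perf_counter() - start)
    return gemm_gflops(n, best)


def main() -> None:
    # float32 exercises sgemm, float64 exercises dgemm.
    for name in ("float32", "float64"):
        print(f"{name}: {bench_matmul(name):.1f} GFLOP/s")
```

Call `main()` to run it. On a well-vectorized code path I would expect float32 to come out noticeably faster than float64, so float32 being no faster (or slower) is a hint that sgemm is on a slower path. Running with the environment variable `MKL_VERBOSE=1` set should make MKL log its calls, which helps confirm what is actually being dispatched.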