by dgasmith 1532 days ago
The looped matrix multiply that you show is very hard to optimize for in the general einsum case. The looped GEMM often shows up with its indices permuted, such as `kbi,kjb->bij`. In that case, heuristics are needed to decide whether dispatching to GEMM is worth it, given the unaligned memory copies it requires.
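
To make the permuted case concrete, here is a rough sketch of what rewriting `kbi,kjb->bij` as a batched GEMM involves. The shapes and variable names are made up for illustration, and this is not how any particular einsum implementation does it internally. The point is only that both operands have to be transposed and copied before `matmul` can be used:

```python
import numpy as np

rng = np.random.default_rng(0)
k, b, i, j = 5, 3, 4, 6  # illustrative sizes only

A = rng.standard_normal((k, b, i))  # indices "kbi"
B = rng.standard_normal((k, j, b))  # indices "kjb"

# Direct einsum: out[b, i, j] = sum_k A[k, b, i] * B[k, j, b]
direct = np.einsum("kbi,kjb->bij", A, B)

# Batched GEMM wants (batch, i, k) @ (batch, k, j), so both operands
# must be permuted. ascontiguousarray forces the memory copy that the
# heuristic has to weigh against the speed of GEMM.
A_bik = np.ascontiguousarray(A.transpose(1, 2, 0))  # kbi -> bik
B_bkj = np.ascontiguousarray(B.transpose(2, 0, 1))  # kjb -> bkj
via_gemm = np.matmul(A_bik, B_bkj)                  # shape (b, i, j)
```

For small or oddly shaped operands the two copies can cost more than GEMM saves, which is why it comes down to heuristics.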

`optimize=True` is generally best when there are more than two tensors in the expression.
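
A small sketch of the multi-tensor case, assuming NumPy's `einsum` and `einsum_path`. The three-matrix chain and the sizes are just an example I picked, not a benchmark:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20  # illustrative size only
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
C = rng.standard_normal((n, n))

expr = "ij,jk,kl->il"

# Default: the whole expression is evaluated as one nested loop.
unoptimized = np.einsum(expr, A, B, C)

# optimize=True: a pairwise contraction order is chosen first, so the
# chain is evaluated as two smaller contractions.
optimized = np.einsum(expr, A, B, C, optimize=True)

# einsum_path returns the chosen contraction order and a readable report.
path, report = np.einsum_path(expr, A, B, C, optimize="greedy")
print(report)
```

With only two tensors there is just one pairwise contraction, so there is no ordering to choose and the path search has less to offer.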