by threatripper 16 days ago
This is already done as far as possible by reordering and merging operations, but transposition (explicit or implicit) is unavoidable for some operations.

A good example of this is A + A^T: you can fuse the two operations into a single pass, but you cannot get around the access pattern of matrix transposition.
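
To make the A + A^T point concrete, here is a rough sketch in plain Python, assuming a square matrix stored row-major in a flat list. The function names are made up for illustration and this is not how any particular library does it. The add and the transpose are fused into one loop with no temporary for A^T, yet the second operand is still read with a stride of n.

```python
def add_transpose_fused(a, n):
    """Compute A + A^T in one pass over a row-major flat list `a` of length n*n."""
    out = [0.0] * (n * n)
    for i in range(n):
        for j in range(n):
            # a[i*n + j] is read contiguously (stride 1) as j advances.
            # a[j*n + i] jumps n elements per step: the transposed access
            # is still there even though no A^T is ever materialized.
            out[i * n + j] = a[i * n + j] + a[j * n + i]
    return out


def transposed_read_offsets(n, i):
    """Flat offsets read for the A^T term while producing output row i."""
    return [j * n + i for j in range(n)]


n = 3
a = [float(k) for k in range(n * n)]  # rows: [0,1,2], [3,4,5], [6,7,8]
result = add_transpose_fused(a, n)
offsets = transposed_read_offsets(n, 1)  # strided reads for output row 1
```

Fusion saves the intermediate buffer and one pass over memory, but the offsets for the A^T term remain n apart. Blocking or tiling can make those strided reads more cache-friendly; it does not remove them.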