by namibj
2951 days ago
Compare to polyhedral optimization [0], supported in GCC via Graphite [1] and in LLVM via Polly [2].
These have a lower ceiling than hand-optimized assembly, but they automate the tuning of how to nest the loops to get the most benefit from the cache hierarchy. Considering their ability to do so for general-purpose number-crunching loops, they are rather nice, but there is still some integration work needed, especially for LLVM, where they lack a nice place in the existing optimization pipeline.

[0]: https://en.wikipedia.org/wiki/Polytope_model
[1]: https://gcc.gnu.org/wiki/Graphite
[2]: https://polly.llvm.org/
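
To make "tuning how to nest the loops" concrete, here is a rough sketch of the kind of rewrite such tools aim to derive automatically: a naive matrix multiply next to a hand-tiled version with the loops interchanged. The function names and the `TILE` value of 32 are my own picks for illustration, not something Graphite or Polly emits; a real polyhedral pass would choose the schedule and tile sizes from its own cost model.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tile edge, picked by hand for illustration only. */
#define TILE 32

static size_t min_sz(size_t x, size_t y) { return x < y ? x : y; }

/* Naive i-j-k nest: c += a * b for n x n row-major matrices.
 * The innermost loop walks b column-wise, which is hard on the cache. */
void matmul_naive(size_t n, const double *a, const double *b, double *c)
{
    assert(a && b && c);
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            for (size_t k = 0; k < n; k++)
                c[i * n + j] += a[i * n + k] * b[k * n + j];
}

/* Same computation, tiled in all three dimensions and interchanged so
 * that j is innermost (unit-stride access to both b and c). */
void matmul_tiled(size_t n, const double *a, const double *b, double *c)
{
    assert(a && b && c);
    for (size_t ii = 0; ii < n; ii += TILE)
        for (size_t kk = 0; kk < n; kk += TILE)
            for (size_t jj = 0; jj < n; jj += TILE)
                /* Work on one TILE x TILE block; min_sz handles edges. */
                for (size_t i = ii; i < min_sz(ii + TILE, n); i++)
                    for (size_t k = kk; k < min_sz(kk + TILE, n); k++) {
                        double aik = a[i * n + k];
                        for (size_t j = jj; j < min_sz(jj + TILE, n); j++)
                            c[i * n + j] += aik * b[k * n + j];
                    }
}
```

Both functions accumulate into `c`, so zero it first. As far as I know, the compiler-side equivalents are `-floop-nest-optimize` on a GCC built with isl, and `-O3 -mllvm -polly` on a Clang built with Polly; whether either one actually finds this particular schedule depends on the version and on how the loop is written.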
[1]: https://arxiv.org/abs/1802.04730