Hacker News
by andrew-wja 2744 days ago
Hi, author here. There is a staggering amount of low-hanging fruit. I have been half-seriously blaming GEMM in correspondence. When you have a problem that looks like GEMM, it's such an attractive hammer to pick up that people just don't look beyond it to other techniques!

To answer your other questions: we already have auto load balancing and primitive fusion, albeit rudimentary, but optimizing scheduling is the obvious next step. We've extended this stuff to use ILP, and we're on our way to press at the moment!

Re: tree width: the tree widths are huge, but the solver library we're using handles them :)

1 comment

How tied are the compiler aspects to the specific kernels (e.g. for convolutional layers)? Could the load balancing and fusion logic be broken out into a library that works with user-defined kernels?
They aren't tied at all -- in fact the optimizer is a totally separate project (triNNity-optimizer) that just does graph optimization. You can add user-defined kernels as long as you have some way of microbenchmarking them!
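
To make the "some way of microbenchmarking them" requirement concrete, here is a rough sketch of what that could look like for a single layer. This is not the triNNity-optimizer interface: `microbenchmark`, `pick_fastest`, and the two toy kernels are hypothetical names made up for illustration, and the real optimizer works over a whole graph rather than one node.

```python
import timeit

def microbenchmark(kernel, args, repeats=5, number=10):
    """Return the best observed seconds-per-call for kernel(*args)."""
    timings = timeit.repeat(lambda: kernel(*args), repeat=repeats, number=number)
    # Taking the minimum reduces the influence of scheduling noise.
    return min(timings) / number

def pick_fastest(candidates, args):
    """candidates maps a name to a callable; returns (best_name, costs)."""
    costs = {name: microbenchmark(fn, args) for name, fn in candidates.items()}
    best = min(costs, key=costs.get)
    return best, costs

# Two hypothetical user-defined kernels computing the same "valid" 1D
# convolution (no kernel flip, as is usual for CNN layers).
def conv1d_direct(signal, taps):
    n = len(signal) - len(taps) + 1
    return [sum(signal[i + j] * taps[j] for j in range(len(taps)))
            for i in range(n)]

def conv1d_im2col(signal, taps):
    # Materialize the patch matrix first, then do a matrix-vector product.
    # This is the GEMM-shaped formulation of the same computation.
    n = len(signal) - len(taps) + 1
    patches = [signal[i:i + len(taps)] for i in range(n)]
    return [sum(p * t for p, t in zip(row, taps)) for row in patches]

if __name__ == "__main__":
    signal = [float(i) for i in range(1024)]
    taps = [0.25, 0.5, 0.25]
    best, costs = pick_fastest(
        {"direct": conv1d_direct, "im2col": conv1d_im2col}, (signal, taps)
    )
    print(best, costs)
```

The only thing the selection step needs from a kernel is a callable and a measured cost, which is why a user-defined kernel can be dropped in without touching the selection logic. A graph-level optimizer would presumably also need costs for converting data layouts between neighbouring kernels, which this sketch leaves out.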