|
|
|
|
|
by almostgotcaught
582 days ago
|
|
> TBH I don't know what you mean either :D Think about the ILP with disjunctions formulation of DSA. It's basically a set of linear equations right? In the case of DNNs, the equations are related: the sizes of the allocations in one layer (in say a CNN) are algebraically a function of the outputs of the previous layer. And so on all the way back to the inputs. So what looks like a set of linear equations in N variables is actually a much smaller set of nonlinear equations in K << N variables. Okay non-linear equations are hard but that's not actually what I'm getting at - what I'm getting at is the instances of the DSA problem in a production ML pipeline are parametric. So I tried to solve this problem: given a DNN and its parameterized DSA instances, can you do some work offline (to solve "part" of the problem) so that at runtime there's not a lot of work left. And that question/problem/challenge falls into the bucket of FPT. > Good luck with your research I changed topics and graduated. No more research for me yay |
|
1. Formulating DSA as ILP doesn't scale. TelaMalloc starts from this fact as motivation.
2. There's no 1-1 mapping between tensors and allocations. Buffer re-use is important.
3. There's room for offline DSA instances as well. IREE's Stream dialect has a related transform (LayoutSlices.cpp).
Anyways, I'm no ML expert. DSA is much more generic than deep learning compilers. I can't wait to graduate myself and never hear the involved keywords again.