|
|
|
|
|
by romesmoke
575 days ago
|
|
TBH I don't know what you mean either :D I know that DSA is NP-complete, and none of the existing deep learning compilers implement the theoretical SOTA (you're probably referring to the 2+ε bound by Buchsbaum et al.) I also know that scaling laws and the memory wall are soon gonna overpower the currently used heuristics (e.g., sort by size and do best-fit). Anyways, I'm glad I "met" someone who has struggled against the same problem. Good luck with your research! |
|
Think about the ILP with disjunctions formulation of DSA. It's basically a set of linear equations right? In the case of DNNs, the equations are related: the sizes of the allocations in one layer (in say a CNN) are algebraically a function of the outputs of the previous layer. And so on all the way back to the inputs. So what looks like a set of linear equations in N variables is actually a much smaller set of nonlinear equations in K << N variables.
Okay non-linear equations are hard but that's not actually what I'm getting at - what I'm getting at is the instances of the DSA problem in a production ML pipeline are parametric. So I tried to solve this problem: given a DNN and its parameterized DSA instances, can you do some work offline (to solve "part" of the problem) so that at runtime there's not a lot of work left. And that question/problem/challenge falls into the bucket of FPT.
> Good luck with your research
I changed topics and graduated. No more research for me yay