by mathisfun123
1181 days ago
yup exactly; it's like other comments on hn about nn frameworks: "abstraction is the most important thing - look at pytorch, it's the best framework because of the perfect/beautiful/brilliant abstractions" (re functorch or fx or dynamo). that entirely ignores how much tedious and grueling bookkeeping/corner-casing/kernel-tuning (by hundreds of full-time engineers, perpetually) it takes to present such an "abstract" interface to the user.
Let's instead talk constructively about concrete techniques. Take OpenAI's Triton (which also has a small team of < 5 core contributors): what is its abstraction, and what is the key to its high performance? It's a tile-based programming model, where a tile can be conveniently lowered to vector instructions and coalesced memory accesses, and transformed into a permuted layout. Its `dot` on tiles can be lowered directly to TensorCore-specific instructions. With those design choices, and without a huge team painfully maintaining the system, critical kernels like FlashAttention can be developed quickly, in say 30 lines of code.
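To make "tile-based" a bit more concrete, here is a rough CPU-only sketch in plain Python. It is not Triton code and it says nothing about how Triton actually lowers tiles. It only shows the shape of the programming model: each program instance owns one output tile, loads input tiles, and accumulates with a tile-level dot. Every name in it (`TILE`, `load_tile`, `tile_dot`, `matmul_tiled`) is made up for illustration.

```python
# CPU-only emulation of a tile-based matmul. This is NOT Triton code;
# TILE, load_tile, tile_dot and matmul_tiled are hypothetical names.

TILE = 2  # tile edge length, kept tiny here so the example is easy to trace


def load_tile(mat, row0, col0):
    """Copy a TILE x TILE block out of a row-major matrix (list of lists)."""
    return [row[col0:col0 + TILE] for row in mat[row0:row0 + TILE]]


def tile_dot(a, b, acc):
    """acc += a @ b on whole tiles; stands in for a tile-level `dot`."""
    for i in range(TILE):
        for j in range(TILE):
            acc[i][j] += sum(a[i][k] * b[k][j] for k in range(TILE))
    return acc


def matmul_tiled(A, B):
    """Multiply A (M x K) by B (K x N); M, K, N must be multiples of TILE."""
    M, K, N = len(A), len(B), len(B[0])
    C = [[0] * N for _ in range(M)]
    # Each (pm, pn) pair plays the role of one independent program instance
    # that owns exactly one output tile.
    for pm in range(0, M, TILE):
        for pn in range(0, N, TILE):
            acc = [[0] * TILE for _ in range(TILE)]
            # Walk along the shared K dimension one tile at a time.
            for k0 in range(0, K, TILE):
                a = load_tile(A, pm, k0)
                b = load_tile(B, k0, pn)
                acc = tile_dot(a, b, acc)
            # Store the finished tile back into the output matrix.
            for i in range(TILE):
                C[pm + i][pn:pn + TILE] = acc[i]
    return C
```

For example, `matmul_tiled([[1, 2, 3, 4], [5, 6, 7, 8]], [[1, 0], [0, 1], [1, 0], [0, 1]])` returns `[[4, 6], [12, 14]]`. The point of the structure is that the user code only ever talks about whole tiles, so a compiler is free to decide how each load and dot maps onto the hardware.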