Hacker News
by fock 459 days ago
Should the matrix multiplication at the core of this not be in a core library? Why are generic layers intermixed with LLM-specific kernels when the generic layers duplicate functionality that is already in torch?

Upstreaming that might actually help researchers doing new stuff, rather than just the narrow demographic of people speeding up LLMs on MI300Xs.