| > “Much of the magic inside of neural network libraries has less to do with cleverer algorithms and more to do with vectorized SIMD instructions and/or being parsimonious with GPU memory usage and communication back and forth with main memory.” I mean… that’s not really fair, is it? We’ve been able to build NN libraries for 30 years, but it’s the transformer architecture on top of them, and the stacked layers forming a coherent network, that are the complex parts, right? Try implementing Stable Diffusion in Clojure (the Python code for it is all open source) and you quickly see how much complexity there is once you’re doing something useful that the primitive operations don’t support. It’s not really any different from OpenCV, which has the basic matrix operations and then paper-by-paper implementations of various algorithms. Building a basic pixel matrix library in Clojure wouldn’t give you an equivalent to OpenCV either. Is there really a clear, meaningful bridge between building low-level operations and building high-level functions out of them? When you implement sqrt, you’ve learnt a thing… but it doesn’t help you build a rendering engine. Hasn’t this always been the problem with learning ML “from scratch”? You start with basic operations, do MNIST… and then… uh, well, no. Now you clone a Python repo that implements the paper you want to work on and modify it, because implementing it from scratch with your primitives isn’t really possible. |