by akssri
2253 days ago
Common Lispers have been able to write CUDA kernels (with shaders) on the fly, using a DSL, in cl-cuda for many years now. See https://github.com/takagi/cl-cuda. CuPy (a project in which takagi is/was actively involved) can also compile kernels from the Python REPL, but the kernel code needs to be written in C and passed in as a string (PyOpenCL can do the same for OpenCL). In fact, not too many years ago, the kernels for gradient descent, max pooling, etc. in Chainer all lived entirely in Python. Projects like JAX take this to another level by tracing a restricted class of Python functions and compiling them, via XLA, straight into GPU kernels. I can't speak much about Clojure/Neanderthal, but I'd advise people to stop dissing projects they are not familiar with, least of all using silly benchmarks like the above.
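To make the JAX point concrete, here is a minimal sketch, assuming only that `jax` is installed. The function name `saxpy` and the input values are made up for illustration. JAX traces the Python function and hands the result to XLA, which generates code for whatever backend is available (the CPU if there is no GPU), so no C string is involved.

```python
import jax
import jax.numpy as jnp


def saxpy(a, x, y):
    # Ordinary NumPy-style Python; the function must stay within the
    # restricted subset that JAX can trace (pure, array-oriented code).
    return a * x + y


# jax.jit traces saxpy for the given input shapes/dtypes and lets XLA
# compile it for the available backend (GPU if present, otherwise CPU).
fast_saxpy = jax.jit(saxpy)

x = jnp.arange(4.0)   # [0., 1., 2., 3.]
y = jnp.ones(4)       # [1., 1., 1., 1.]
result = fast_saxpy(2.0, x, y)

# make_jaxpr exposes the traced intermediate program that XLA receives.
jaxpr = jax.make_jaxpr(saxpy)(2.0, x, y)
print(result)
print(jaxpr)
```

Printing `jaxpr` shows the traced multiply and add operations, which is roughly the level at which JAX hands work to the compiler.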
Yes, you showed PyTorch code related to the blog post, and it confirms what the blog post says. But when I pointed out that the code is functionally incorrect (it is missing some calculations), you didn't even bother to correct it, or to confirm that the code is fine and that I'm wrong.
So, it seems that your standard is that it is enough for one side to throw bits and pieces around and call it a day, and for the other to run around and prove that their stuff is better than everything that could possibly be done in every technology.
I choose to stick to the theme. The theme is CuPy, NumPy, Clojure & Neanderthal. The related theme could be code in another technology. Great - write about it. But, even if every other technology were a million times better than what I describe in the article, it does not change the fact that CuPy and NumPy have the issue I've described.