| HN Mirror

by jampekka 1936 days ago

OTOH PyTorch seems to be highly explosive if you try to use it outside the mainstream use (i.e. neural networks). There's sadly no performant autodiff system for general purpose Python. Numba is fine for performance, but does not support autodiff. JAX aims to be sort of general purpose, but in practice it is quite explosive when doing something other than neural networks.

A lot of this is probably due to supporting CPUs and GPUs with the same interface. There are quite profound differences in how CPUs and GPUs are programmed, so the interface tends to restrict especially more "CPU-oriented" approaches.

I have nothing against supporting GPUs (although I think their use is overrated and most people would do fine with CPUs), but Python really needs a general purpose, high performance autodiff.
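To make "general purpose autodiff" concrete, here is a minimal sketch of forward-mode differentiation with dual numbers in plain Python. It is not how PyTorch, JAX, or Numba work internally; the `Dual` class and the helpers `derivative`, `power`, and `f` are hypothetical names made up for illustration. The point is that ordinary branches and recursion differentiate without any special handling, although pure-Python code like this is nowhere near "performant".

```python
import math


class Dual:
    """A value paired with its derivative (forward-mode autodiff)."""

    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der

    @staticmethod
    def lift(x):
        # Treat plain numbers as constants (derivative 0).
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__

    def __gt__(self, other):
        # Comparisons look at the value only, so a normal `if` works.
        return self.val > Dual.lift(other).val

    def __lt__(self, other):
        return self.val < Dual.lift(other).val


def sin(x):
    x = Dual.lift(x)
    # Chain rule: sin(u)' = cos(u) * u'
    return Dual(math.sin(x.val), math.cos(x.val) * x.der)


def derivative(f, x):
    """Evaluate df/dx at x by seeding the derivative slot with 1."""
    return Dual.lift(f(Dual(x, 1.0))).der


def power(x, n):
    """x**n written as plain recursion."""
    return 1.0 if n == 0 else x * power(x, n - 1)


def f(x):
    # A data-dependent branch plus a recursive helper.
    if x > 0:
        return power(x, 3) + sin(x)
    return -1.0 * x
```

Calling `derivative(f, 2.0)` takes the `x > 0` branch and evaluates 3x² + cos(x) at x = 2, while `derivative(f, -1.0)` takes the other branch and gives -1.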

7 comments

wxnx 1935 days ago

> I have nothing against supporting GPUs (although I think their use is overrated and most people would do fine with CPUs), but Python really needs a general purpose, high performance autodiff.

As someone who works with machine learning models day-to-day (yes, some deep NNs, but also other stuff) - GPUs really seem unbeatable to me for anything gradient-optimization-of-matrices (i.e. like 80% of what I do) related. Even inference in a relatively simple image classification net takes an order of magnitude longer on CPU than GPU on the smallest dataset I'm working with.

Was this a comment about specific models that have a reputation as being more difficult to optimize on the GPU (like tree-based models - although Microsoft is working in this space)? Or am I genuinely missing some optimization techniques that might let me make more use of our CPU compute?

jampekka 1935 days ago

For gradient-optimization-of-matrices, for sure. Just make sure that you don't use gradient-optimization-of-matrices only because it runs well on GPUs. If you tie yourself to GPUs, you may miss more efficient approaches to your problems that are infeasible on the GPUs' wide SIMD architecture.

In general it's more that some specific models are easy for GPUs. Most models probably are not.

_coveredInBees 1935 days ago

I really don't understand the "GPUs are overrated" comment. As someone who uses PyTorch a lot and GPU compute almost every day, there is an order of magnitude difference in the speeds involved for most common CUDA / OpenCL accelerated computations.

PyTorch makes it pretty easy to get large GPU-accelerated speed-ups with a lot of code we used to traditionally limit to NumPy. And this is for things that have nothing to do with neural networks.

jampekka 1935 days ago

For a lot of cases you don't really need that much performance. Modern processors are plenty fast. It seems that the current push to use GPUs also pushes people towards GPU-oriented solutions, such as using huge NNs for more or less anything, while other approaches would in many cases be orders of magnitude more efficient and more robust.

GPUs (or "wide SIMDs" more generally) have quite profound limitations. Branching is very limited, recursion is more or less impossible and parallelism is possible only for identical operations. This makes for example many recursion-based time-series methods (e.g. Bayesian filtering) very tricky or practically impossible. From what I gather, running recurrent networks is also tricky and/or hacky on GPU.
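As a rough illustration of the kind of recursion meant here, below is a minimal sketch of a one-dimensional Kalman filter, one of the simplest Bayesian filters. The random-walk model, the function name `kalman_1d`, and the default noise variances are assumptions chosen for the example, not anything from the comment. Each step needs the posterior from the previous step, so the loop over time cannot simply be spread across identical parallel lanes.

```python
def kalman_1d(observations, q=0.01, r=0.25, mean=0.0, var=1.0):
    """Filter a random-walk state observed with Gaussian noise.

    q: process noise variance (assumed), r: observation noise variance
    (assumed), mean/var: prior on the initial state (assumed).
    Returns the list of posterior means and the final posterior variance.
    """
    means = []
    for y in observations:
        # Predict: a random walk keeps the mean and inflates the variance.
        var = var + q
        # Update: blend prediction and observation using the Kalman gain.
        gain = var / (var + r)
        mean = mean + gain * (y - mean)
        var = (1.0 - gain) * var
        # The next iteration starts from this posterior (mean, var).
        means.append(mean)
    return means, var
```

For example, `kalman_1d([1.0] * 5)` returns posterior means that climb from the prior mean of 0 towards the observed value 1 as evidence accumulates.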

GPUs are great for some quite specific, yet quite generally applicable, solutions, like tensor operations etc. But being tied to GPUs' inherent limitations also limits the space of approaches that are feasible to use. And in the long run this can stunt the development of different approaches.

mpfundstein 1934 days ago

> For a lot of cases you don't really need that much performance. Modern processors are plenty fast. It seems that the current push to use GPUs also pushes people towards GPU-oriented solutions, such as using huge NNs for more or less anything, while other approaches would in many cases be orders of magnitude more efficient and more robust.

For instance?

_coveredInBees 1935 days ago

I still don't get the criticism of PyTorch. If anything, you can get the best of both worlds in many ways, with their API supporting GPU and CPU operations in exactly the same way.

ahendriksen 1936 days ago

What do you mean by “seems to be highly explosive”? I have used PyTorch to model many non-DNN things and have not experienced highly explosive behavior. (Could be that I have become too familiar with common footguns though.)

lgessler 1935 days ago

I get what you mean by the "GPUs are overrated" comment, which is that they're thought of as essential in many cases when they're probably not, but in many domains like NLP, GPUs are a hard requirement for getting anything done.

jl2718 1935 days ago

Have you tried using Enzyme* on Numba IR?

* https://enzyme.mit.edu

komuher 1935 days ago

Wait, what? JAX and also PyTorch are used in a lot more areas than NNs. JAX is even considered to do better in that department in terms of performance than all of Julia, so what are you talking about?

BadInformatics 1935 days ago

GP makes a fair point about JAX still requiring a limited subset of Python though (mostly control flow stuff). Also, there's really no in-library way to add new kernels. This doesn't matter for most ML people but is absolutely important in other domains. So Numba/Julia/Fortran are "better in that department in terms of performance" than JAX because the latter doesn't even support said functionality.

jpsamaroo 1935 days ago

> JAX is even considered to do better in that department in terms of performance than all of Julia, so what are you talking about?

Please provide sources for this claim.

UncleOxidant 1935 days ago

> There's sadly no performant autodiff system for general purpose Python.

Like there is for general purpose Julia code? (https://github.com/FluxML/Zygote.jl)

> I have nothing against supporting GPUs (although I think their use is overrated and most people would do fine with CPUs),

Do you run much machine learning code? All those matrix multiplications run a good bit faster on the GPU.
