Hacker News
by antome 2328 days ago
As someone who has used both PyTorch and TensorFlow for a couple of years now, I can attest to the faster research iteration times for PyTorch. TensorFlow has always felt like it was designed for some mythical researcher who could come up with a complete architecture ahead of time, based on off-the-shelf parts.

Indeed, no wonder PyTorch has beaten TensorFlow so thoroughly in the last 3 years, going from 1% of the papers to ~50% of the papers (TensorFlow is now down to only 23% of the papers):

https://paperswithcode.com/trends

According to the methodology on that page, the standalone version of Keras (using from keras.models imports, as recommended by the Keras docs) would be classified as "Other". (I tried to find the source code to verify this, but couldn't.)
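
To make the concern concrete, here is a rough sketch of how an import-based classifier could end up putting standalone Keras under "Other". This is purely hypothetical: the mapping, the regex and the function name classify_source are my own assumptions for illustration, not the site's actual code or rules.

```python
import re

# Hypothetical mapping from top-level module name to framework label.
# An assumption for illustration only, not the site's actual rules.
FRAMEWORK_BY_MODULE = {
    "torch": "PyTorch",
    "tensorflow": "TensorFlow",
}

# Capture the top-level module of each "import x" / "from x.y import z" line.
IMPORT_RE = re.compile(r"^\s*(?:from|import)\s+([A-Za-z_]\w*)", re.MULTILINE)


def classify_source(source: str) -> str:
    """Label a source file by its first recognized top-level import."""
    for module in IMPORT_RE.findall(source):
        label = FRAMEWORK_BY_MODULE.get(module)
        if label is not None:
            return label
    # Anything without a recognized import falls through to "Other".
    return "Other"


standalone_keras = "from keras.models import Sequential\n"
tf_keras = "from tensorflow.keras.models import Sequential\n"

print(classify_source(standalone_keras))  # Other
print(classify_source(tf_keras))          # TensorFlow
```

Under those assumptions the same model code gets a different label depending only on which import path was used, which is the effect described above.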

And if that is correct, then I'd be astonished if the vast majority of the "Other" papers aren't Keras. I work in ML and I don't think I've seen a paper that didn't use PyTorch, TensorFlow or Keras in years.

And if that's the case then almost certainly there are more that use TF than PyTorch: PyTorch is 42%, TF is 23% but Other is 36%.
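
A quick back-of-the-envelope check of that claim, using only the three percentages quoted in this comment. It assumes that Keras papers in "Other" run on the TensorFlow backend, which is an assumption and not something the page states; the helper name tf_share is made up for this sketch.

```python
# Shares quoted in the comment (percent of papers).
shares = {"PyTorch": 42, "TensorFlow": 23, "Other": 36}


def tf_share(keras_fraction_of_other: float) -> float:
    """TensorFlow share if this fraction of 'Other' is TF-backed Keras (assumed)."""
    return shares["TensorFlow"] + keras_fraction_of_other * shares["Other"]


# Fraction of "Other" that must be TF-backed Keras for TF to draw level with PyTorch.
break_even = (shares["PyTorch"] - shares["TensorFlow"]) / shares["Other"]

print(tf_share(1.0))         # upper bound: all of "Other" counted as TensorFlow
print(round(break_even, 2))  # share of "Other" needed to tie with PyTorch
```

So TF only overtakes PyTorch if a bit more than half of "Other" is really Keras on TensorFlow; with all of it counted, TF would reach 59%.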

(In terms of biases, I hate working in TensorFlow, and much prefer PyTorch and Keras. But numbers are numbers.)

JAX?
Are there any papers that use it for things other than demonstrating JAX? I can't think of one off the top of my head.

Perhaps I should have specified "papers outside those introducing new frameworks, or around speed benchmarking".

There are a bunch of interesting papers using custom libraries for distributed training, and ones targeted at showing off the performance of specific hardware (NVIDIA has a bunch of interesting work in this space, and Intel and other smaller vendors have done things too).

It's still early days for JAX, but there's Neural Tangents https://arxiv.org/abs/1912.02803 and Reformer https://arxiv.org/abs/2001.04451 from ICLR.
I agree about it being early days.

Reformer is a good example that I'd missed.

Neural Tangents is another paper demoing a framework.

Keras is pretty good until you hit a custom loss function that needs operations that aren't defined in Keras' backend. Then you suddenly have to switch over and write them in TensorFlow, with some ugly consequences: sometimes you don't know which operations will be GPU-accelerated, and slicing vectors to compute and aggregate partial loss functions with complicated math formulas might force computation onto the CPU.