by UncleOxidant 2448 days ago
Last year I was tasked with looking into a NAS (Neural Architecture Search) paper and analyzing the algorithm. The paper came with a TensorFlow implementation. Trying to read that TF code was quite difficult. I searched around and found a PyTorch implementation: much easier to read and understand, and it ran about 50% faster as well (the latter was a bit surprising). I tend to think that TensorFlow lends itself to the creation of code that's difficult to reason about. That may be different now with the various flavors of TF (like TF Eager).

I'll add that it was much easier to install PyTorch with GPU support than it was to install TensorFlow with GPU support, at least around November of last year. The PyTorch install was painless, whereas we ended up having to build TF from source to work with our setup. Could be different now, as I haven't looked at TF since then.

4 comments

> That may be different now with the various flavors of TF (like TF Eager)

Unfortunately, if anything I think it's the opposite. The constant creation and deprecation of TF flavors (tf-eager, tf-slim, tf-learn, keras, tf-estimator, tf.contrib [RIP]) has made reading TensorFlow code online somewhat disastrous. Everybody, including the TF team, is using a different API, and it's difficult to keep all of them straight. It seems that you're doomed to end up using some combination of many of the above in a way that makes sense to you and your team, adding another confusing model to the pile.

Agree overall, but tf.eager doesn't have much to do with the rest of the list.

tf.contrib is just a module where user-contributed code was stored, which included both low-level constructs and higher-level APIs. tf.estimator is an abstraction that is mostly used for productionizing models. tf.slim/tf.learn were indeed redundant with Keras (a library developed externally), but were necessary steps before Keras became part of TensorFlow.

What also made me fall in love with PyTorch is that the "neural network training process" is written in code almost exactly as it is described in theory:

- loop through epochs

- loop through each batch

- run a forward pass for the batch ( model(batch) )

- calculate the loss for the batch ( criteria(y, yprim) )

- compute the gradients/backprop ( loss.backward() )

- update the weights (optimizer.step())

This really reinforced everything I had learned, and I think it breaks the problem down nicely. All this, of course, is in addition to everything else already mentioned, plus the super convenient module/network building and definition.
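For anyone who wants to see the steps listed above as code, here is a minimal sketch. The toy data, the `nn.Linear` model, the batch size, and the learning rate are all made up for illustration; only the shape of the loop matters. Note one extra step the list leaves out: gradients accumulate in PyTorch, so they are usually cleared with `optimizer.zero_grad()` before each backward pass.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical toy regression data: y = 2x + 1 plus a little noise.
x = torch.linspace(-1, 1, 64).unsqueeze(1)
y = 2 * x + 1 + 0.01 * torch.randn_like(x)

model = nn.Linear(1, 1)                      # stand-in for a real network
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

batch_size = 16
losses = []                                  # summed batch loss per epoch

for epoch in range(50):                      # loop through epochs
    epoch_loss = 0.0
    for i in range(0, len(x), batch_size):   # loop through each batch
        xb = x[i:i + batch_size]
        yb = y[i:i + batch_size]

        optimizer.zero_grad()                # clear gradients from the last step
        pred = model(xb)                     # forward pass for the batch
        loss = criterion(pred, yb)           # loss for the batch
        loss.backward()                      # compute gradients / backprop
        optimizer.step()                     # update the weights

        epoch_loss += loss.item()
    losses.append(epoch_loss)
```

In real code the manual slicing would typically be replaced by a `torch.utils.data.DataLoader`, but the inner five lines stay the same.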

Yes! Exactly.
I will add further to this point.

There will be bad model code in both PyTorch and TensorFlow. The difference is that bad PyTorch code reads like bad Python code, and I've accumulated a lot of experience reasoning through bad Python code. Bad TensorFlow code can come from any one of the many paradigms that TF has gone through, and I don't even know if it's bad or just some funky new TF functionality I'm unfamiliar with.

> I don't even know if it's bad or just some funky new TF functionality I'm unfamiliar with.

Yes, Javascript features and frameworks have evolved quite fas... oh wait

Yes, installing PyTorch with CUDA support has been unbelievably simple. I've never had trouble with simply pip-installing it in my venv.