| HN Mirror



	by Raf_ 2912 days ago
	Author here - the article compares Keras and PyTorch as the first Deep Learning framework to learn. It explores the differences between the two in terms of ease of use, flexibility, debugging experience, popularity, and performance, among others. If you have experience learning or teaching Deep Learning with PyTorch or Keras, we’d love to hear your thoughts about them.

4 comments

probably_wrong 2912 days ago

For what it's worth, here's my experience:

My adviser decided (wisely) that we all needed to learn neural networks, and we settled on TensorFlow. That went... poorly. I've told this before: the Seq2Seq tutorial was designed for an older version of TF, and it triggered a bug that was never fixed because that way of doing Seq2Seq was deprecated and a new tutorial was coming "soon". The "tutorial" was also just a code dump with barely any comments.

Eventually we had new people coming in with even less theoretical background than ours (we had read papers for at least 6 months), and that's when we realised it would not work at all. So we organised a 1-week hackathon with PyTorch, and we've been using it ever since.


Al-Khwarizmi 2912 days ago

Similar story here. I got bitten by that very seq2seq "tutorial", lost a lot of time with it, and haven't used TensorFlow ever since except for reproducing other people's experiments. It's Keras, Torch, DyNet or PyTorch for me.


bitL 2912 days ago

I agree, I also use Keras for stable complex models (up to 1000 layers) in production and PyTorch for fun (DRL). However, if I want to run distributed training with minimal setup, whether I like it or not, the simplest way is to use TensorFlow's Estimator API and some pre-baked environment like SageMaker. Horovod or CERNDB/Keras require a bit more setup/devops work. The issue with Estimators is that once you start using some bleeding-edge things in Keras, it can be very complicated to translate them to Estimators, even though converting a plain Keras model to a tf.Estimator is trivial.


jacquesm 2912 days ago

> I also use Keras for stable complex models (up to 1000 layers)

That sounds interesting, are you at liberty to say what you are doing?


bitL 2912 days ago

Complex computer vision classification tasks based on DenseNet/ResNet approaches; those can often be reduced in depth with some Wide ResNet technique. Keras is super easy there, and you get world-class performance after 1 hour of coding and a week of training, if you know what you are doing.


mlthoughts2018 2912 days ago

I mentioned this in another comment [0], but it's also useful here: most of TensorFlow's tools for distributed or multi-GPU training work out of the box directly with Keras, so distributed training is not at all a reason to use TensorFlow directly over Keras. At worst, you have to add a tiny bit of TensorFlow code on top of the majority being in Keras, but you would still never need to write a significant amount directly in TensorFlow.

I also work on production systems built around deep ResNet architectures for computer vision tasks, and my team does this using solely Keras, including when we do distributed training.

Just adding this thought in case anyone mistakenly thinks you have to start out all-in using only TensorFlow because you might expect to need distributed training at some point.

[0]: https://news.ycombinator.com/item?id=17416904


inputcoffee 2912 days ago

I appreciate how you focus on just two, which are the state of the art in your opinion.

I find it less useful to see comparisons of "top 50 deep learning frameworks for 2018" which include esoteric stuff that is only there for the sake of completeness.

This way a person branching out from TensorFlow (I assume it's TensorFlow) knows which two frameworks to try out, and what to look for.


entropie 2912 days ago

The article times out.


stared 2912 days ago

It seems to be a HN hug of death.

Though, as far as I can see, it does load, sometimes with a considerable delay (5-10 sec).
