HN Mirror



	by mark_l_watson 2918 days ago
	Nice article, and I agree with the explanations of what makes Keras and TensorFlow best for specific use cases. Some history: I have used TensorFlow for years and switched to coding against the Keras APIs about 8 months ago. I wish I had more experience with PyTorch, but I just don't have the time right now to do more than play with it. One suggestion to the authors: the benchmark figures are interesting, but I wish you had also shown CPU-only results. At work I have all the GPU resources I need, but for my home projects, which are all NLP deep learning experiments, I usually rent a many-core, large-memory server with no GPUs (GPUs seem to speed up RNNs less than other model types).

2 comments

visarga 2918 days ago

For most applications you can probably use a TCN (temporal convolutional network) instead of an LSTM. TCNs are implemented in all major frameworks and run an order of magnitude faster because they process all time steps in parallel.

https://arxiv.org/abs/1803.01271

https://arxiv.org/abs/1608.08242


droidist2 2918 days ago

I've been using a QRNN in PyTorch. Is this similar to a TCN?

https://github.com/salesforce/pytorch-qrnn


visarga 2918 days ago

No, TCN is similar to WaveNet (dilated convolutions + masking the future + residual connections). It's a plain convnet, not an LSTM with a twist. That's why it runs efficiently in parallel on GPUs, like image processing convnets.
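
A minimal sketch of those three ingredients (dilated convolutions, masking the future, residual connections), in plain Python with no framework so it stays self-contained. The function names (`causal_dilated_conv`, `residual_block`, `tiny_tcn`), the fixed weights, and the dilation schedule are illustrative assumptions, not taken from the TCN paper or any library. A real TCN block also uses multiple channels, learned weights, and normalization.

```python
def causal_dilated_conv(x, weights, dilation):
    """1-D causal convolution over a list of floats.

    Output t depends only on x[t], x[t - dilation], x[t - 2*dilation], ...
    Positions before the start of the sequence are treated as zeros.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation  # look back only, never forward
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def residual_block(x, weights, dilation):
    """Causal dilated conv, then ReLU, then add the input back (residual)."""
    conv = causal_dilated_conv(x, weights, dilation)
    return [xi + max(0.0, ci) for xi, ci in zip(x, conv)]


def tiny_tcn(x, weights=(0.5, 0.5), dilations=(1, 2, 4)):
    """Stack blocks with growing dilation to widen the receptive field."""
    for d in dilations:
        x = residual_block(x, weights, d)
    return x


if __name__ == "__main__":
    seq = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    print(tiny_tcn(seq))
```

Changing the last input leaves every earlier output untouched, which is the "masking the future" property. Each output is a function of the inputs only, with no hidden state carried from step to step, so all time steps can be computed independently.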


Smerity 2918 days ago

Actually, yes, the QRNN has all of those features.

The first figure from our paper shows how the "LSTM with a twist" matches the speed of a plain convnet by running efficiently in parallel on GPUs, like image-processing convnets.[1]

Best of all, as it's only an "LSTM with (these) twists", it's drop-in compatible with existing LSTMs but can get you a 2-17 times speed-up over NVIDIA's cuDNN LSTM - essentially speed equivalent to the TCN or WaveNet speed-up.
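
For readers unfamiliar with the QRNN, here is a rough sketch of its recurrent part, assuming the fo-pooling variant described in the QRNN paper. The gate pre-activations are assumed to come from convolutions over the input, which can be computed for all time steps in parallel. Only the cheap element-wise loop below is sequential. The function name `fo_pool` and the single-channel, plain-Python setup are illustrative assumptions, not the pytorch-qrnn API.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def fo_pool(z_pre, f_pre, o_pre):
    """Sequential part of a QRNN layer for a single channel.

    z_pre, f_pre, o_pre hold one pre-activation value per time step.
    They are assumed to be outputs of convolutions over the input,
    so they do not depend on the previous hidden state.
    """
    c = 0.0  # cell state, starts at zero
    hs = []
    for zp, fp, op in zip(z_pre, f_pre, o_pre):
        z = math.tanh(zp)   # candidate value
        f = sigmoid(fp)     # forget gate
        o = sigmoid(op)     # output gate
        c = f * c + (1.0 - f) * z  # element-wise recurrence, no matrix multiply
        hs.append(o * c)
    return hs


if __name__ == "__main__":
    print(fo_pool([1.0, -1.0, 0.5], [0.0, 2.0, -2.0], [0.0, 0.0, 0.0]))
```

The contrast with an LSTM is that the gates here never read the previous hidden state, so the expensive matrix work is done once for the whole sequence and the per-step loop is a handful of scalar operations.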

That's why Baidu implemented QRNN in their production Deep Voice 2 neural text-to-speech (TTS) system[3].

This isn't to say TCN or QRNN is better, simply that it's dangerous to flat out say _no_ if you're not actually certain or don't correctly recall the underlying information.

Disclaimer: I'm the co-author of the QRNN.

Double disclaimer: The TCN paper cites the QRNN but decides not to test against it. They also show results over one of my datasets.

[1]: https://www.semanticscholar.org/paper/Quasi-Recurrent-Neural...

[2]: https://github.com/salesforce/pytorch-qrnn

[3]: https://www.semanticscholar.org/paper/Deep-Voice-2%3A-Multi-...


visarga 2917 days ago

Haha, small world. I knew about QRNN and even tried to find an implementation once to test it on my data. Neat idea.


mark_l_watson 2916 days ago

Thanks! I am trying TCNs this weekend.


Raf_ 2918 days ago

Glad you like the article, and thanks for suggesting that we research CPU usage across these frameworks. It's something worth looking into.
