	by ajtulloch 4212 days ago
	This paper describes one part of the fbcunn release (the fast convolution layers implemented via FFT, with the source available at https://github.com/facebook/fbcunn/tree/master/src/cuda/fft). There's a lot more in fbcunn if you want to check it out.
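	For readers unfamiliar with the idea behind FFT-based convolution layers, here is a minimal NumPy sketch of the underlying convolution theorem. It is not the fbcunn code (which is CUDA), and the function names `conv2d_direct` and `conv2d_fft` are made up for illustration. It only shows that a sliding-window convolution can be computed as a pointwise product in the frequency domain.

```python
import numpy as np

def conv2d_direct(image, kernel):
    """Valid-mode 2D cross-correlation (what CNN 'convolution' layers compute), with explicit loops."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out

def conv2d_fft(image, kernel):
    """Same result via the convolution theorem: FFT, pointwise multiply, inverse FFT."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    # Cross-correlation equals true convolution with the kernel flipped in both axes.
    flipped = kernel[::-1, ::-1]
    # Zero-pad to the full linear-convolution size so circular wraparound cannot occur.
    fh, fw = ih + kh - 1, iw + kw - 1
    spectrum = np.fft.rfft2(image, s=(fh, fw)) * np.fft.rfft2(flipped, s=(fh, fw))
    full = np.fft.irfft2(spectrum, s=(fh, fw))
    # Keep only the region where the kernel fully overlaps the image.
    return full[kh - 1:ih, kw - 1:iw]
```

	The two functions agree up to floating-point error. The FFT route tends to pay off as kernels get larger, since its cost does not grow with the kernel size the way the direct loop does.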

1 comments

Hydraulix989 4212 days ago

In my experience, the fully connected layers are the bottleneck. The other issue is the alternation between compute-heavy convolution and IO-heavy pooling. I'm curious how this FFT implementation stacks up against cuDNN: what is the speedup for the convolutional layers alone, and what is the overall speedup?


ajtulloch 4212 days ago

http://arxiv.org/pdf/1412.7580v2.pdf compares the FFT convolution implementation with the cuDNN convolution layers. For the FC layers, it's just cuBLAS `sgemm`.
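
To illustrate why an FC layer reduces to a single GEMM call, here is a rough NumPy sketch. The name `fc_forward` and the shapes are assumptions for illustration, not anything from fbcunn. NumPy hands float32 matrix products to whatever BLAS it was built against, which is the CPU analogue of calling `sgemm`.

```python
import numpy as np

def fc_forward(x, weight, bias):
    """Fully connected forward pass for a whole batch: one matrix multiply plus a bias add.

    x:      (batch, in_features)
    weight: (out_features, in_features)
    bias:   (out_features,)
    """
    return x @ weight.T + bias

# Hypothetical sizes chosen only for the example.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8)).astype(np.float32)
weight = rng.standard_normal((3, 8)).astype(np.float32)
bias = rng.standard_normal(3).astype(np.float32)

y = fc_forward(x, weight, bias)  # shape (4, 3)
```

Batching is what turns per-sample matrix-vector products into one matrix-matrix product, so the whole layer is a single GEMM.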
