by giacaglia 2482 days ago
One bottleneck is running the model only on the GPU in cases where the CPU would be more efficient. But most bottlenecks are memory issues: GPUs do not necessarily have enough memory, so you end up having to access "external memory", which slows down the forward pass a ton.
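To make the memory point concrete, here is a rough back-of-the-envelope check, not a real profiler. The function name, the 8 GiB card, the 4 bytes per parameter (float32), and the 20% reserved for activations and workspace are all assumptions picked for illustration.

```python
def fits_in_gpu_memory(num_params,
                       bytes_per_param=4,               # assumption: float32 weights
                       gpu_memory_bytes=8 * 1024**3,    # assumption: hypothetical 8 GiB card
                       overhead_fraction=0.2):          # assumption: share kept for activations etc.
    """Rough check: do the model weights alone fit in the usable GPU memory?"""
    weight_bytes = num_params * bytes_per_param
    usable_bytes = gpu_memory_bytes * (1 - overhead_fraction)
    return weight_bytes <= usable_bytes


# A 100M-parameter model needs about 0.4 GB of weights; a 3B-parameter one about 12 GB.
small_fits = fits_in_gpu_memory(100_000_000)
large_fits = fits_in_gpu_memory(3_000_000_000)
```

If the check fails, some of the weights or activations have to live off the GPU, and that transfer is the kind of "external memory" access that slows the forward pass.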

Also, in some cases, like small RNNs/LSTMs, CPUs can be faster.