by lightcatcher
1811 days ago
In my opinion, parallel prefix sum is the most underappreciated parallel algorithm, and this paper is the best explanation and visualization of the concept I've seen. A few years ago I worked on a deep learning project that used parallel prefix sum as a new way to accelerate recurrent neural nets on GPUs [0]. The paper in this post was the most important reference and source of inspiration. I'm happy to see it shared on HN, and I hope it sparks ideas in others.

[0] https://arxiv.org/abs/1709.04057