Hacker News new | ask | show | jobs
by bassp 243 days ago
Yes! There’s a canonical algorithm called the “Blelloch scan” for prefix sum (aka prefix scan, because you can generalize “sum” to “any binary associative function”) that’s very gpu friendly. I have… fond is the wrong word, but “strong” memories of implementing in a parallel programming class :)

Here’s a link to a pretty accessible writeup, if you’re curious about the details: https://developer.nvidia.com/gpugems/gpugems3/part-vi-gpu-co...

1 comments

Mm, I used that exact writeup as a reference to implement this algorithm in WebGL 3 years ago: https://github.com/ashtonsix/webglc/blob/main/src/kernel/sca...

It even inspired the alternative "transpose" method I describe in the OP README.