Hacker News new | ask | show | jobs
by TimorousBestie 271 days ago
The single-thread performance of the parallel prefix sum that they use is O(N log N), so the improvement from that to O(log N) on N threads is not as surprising.

The way the headline is written, it sounds like Amdahl’s law was violated. It wasn’t, of course.

1 comments

How's the prefix sum on a single thread O(N log(N))? Isn't it trivially O(N)? It's just a for loop.
Yes, but for loop comes with all those data dependencies that prevent it from being parallelized trivially.

The algorithm with fewer data dependencies is O(N log N).

This is covered in more detail in the article.

It's from the depth of the computation, not the work