Hacker News
by kcbanner 2326 days ago
This library can leverage data parallelism to increase throughput vs. the scalar versions, i.e. one instruction performs 4 operations instead of one. If the problem you are solving is suited to this parallelism, you could get a significant speedup.
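
To make the "4 operations per instruction" point concrete, here is a rough sketch in C. It is not taken from the library under discussion: the names `add_scalar`, `add_simd`, and `v4f` are made up for illustration, and it assumes a GCC or Clang compiler, since it relies on their `vector_size` extension.

```c
#include <stddef.h>
#include <string.h>

/* GCC/Clang vector extension (assumed compiler): four floats in one 16-byte value. */
typedef float v4f __attribute__((vector_size(16)));

/* Scalar version: one addition per loop iteration. */
void add_scalar(const float *a, const float *b, float *out, size_t n) {
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}

/* Data-parallel version: four additions per loop iteration. */
void add_simd(const float *a, const float *b, float *out, size_t n) {
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        v4f va, vb, vr;
        memcpy(&va, a + i, sizeof va);   /* load 4 floats without alignment assumptions */
        memcpy(&vb, b + i, sizeof vb);
        vr = va + vb;                    /* element-wise add across all 4 lanes */
        memcpy(out + i, &vr, sizeof vr); /* store 4 results */
    }
    /* Leftover elements when n is not a multiple of 4 fall back to scalar adds. */
    for (; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}
```

On targets with 128-bit SIMD registers the `va + vb` line typically compiles down to a single packed add, but whether that turns into a real speedup depends on the workload and memory access pattern, so it has to be measured.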
1 comment

Sure, so why is that speedup not measured in the benchmark?