Hacker News
by nbrempel 712 days ago
Thanks for reading, everyone. I’ve gotten some feedback over on Reddit as well that the example doesn't effectively show the benefits of SIMD. I plan on revising it.

One of my goals of writing these articles is to learn so feedback is more than welcome!

3 comments

What's fun is that, because the SIMD in your example does no useful work, LLVM (correctly) removes it entirely, so your "neon" and "fallback" versions compile to exactly the same code, without any SIMD (compiler explorer: https://godbolt.org/z/YWoMGoaxT).

As an additional note, aarch64 always has NEON (similar to how x86-64 always has SSE2; extensions worth dispatching on would be SVE on aarch64 and AVX/AVX2/AVX-512 on x86-64), so there's no point in dynamically checking for it.
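For illustration, here is a rough sketch of what dispatching on an optional extension could look like in Rust. This is not the article's code: the function names (`sum`, `sum_avx`, `sum_fallback`) and the choice of a plain `f32` sum are assumptions made up for this example. It checks for AVX at runtime on x86-64 and falls back to scalar code everywhere else.

```rust
// Hypothetical example: sum a slice of f32, using AVX when the CPU has it.
// Function names are made up for illustration.
pub fn sum(values: &[f32]) -> f32 {
    #[cfg(target_arch = "x86_64")]
    {
        // AVX is optional on x86-64, so a runtime check is meaningful here
        // (unlike SSE2 on x86-64 or NEON on aarch64).
        if is_x86_feature_detected!("avx") {
            // SAFETY: we just verified that the CPU supports AVX.
            return unsafe { sum_avx(values) };
        }
    }
    sum_fallback(values)
}

#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx")]
unsafe fn sum_avx(values: &[f32]) -> f32 {
    use std::arch::x86_64::*;

    let chunks = values.chunks_exact(8);
    let tail = chunks.remainder();

    // Accumulate 8 lanes at a time.
    let mut acc = _mm256_setzero_ps();
    for chunk in chunks {
        // Unaligned load of 8 consecutive f32 values.
        acc = _mm256_add_ps(acc, _mm256_loadu_ps(chunk.as_ptr()));
    }

    // Spill the 8 partial sums and add the leftover elements.
    let mut lanes = [0.0f32; 8];
    _mm256_storeu_ps(lanes.as_mut_ptr(), acc);
    lanes.iter().sum::<f32>() + tail.iter().sum::<f32>()
}

fn sum_fallback(values: &[f32]) -> f32 {
    values.iter().sum()
}
```

Note that the vector path adds elements in a different order than the scalar path, so results can differ slightly for inputs where floating-point rounding matters.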

Great read!

> One of my goals of writing these articles is to learn so feedback is more than welcome!

When I went into the Rust playground to see the assembly output for the Cumulative Sum example, I could only get it to show the compiler warnings, not the actual assembly. I'm probably doing something wrong, but for me this was a barrier that detracted from the article. I'd suggest incorporating the assembly directly into the article, while keeping the playground link for people who are more dedicated / competent than I am.

The function has to be made pub so it doesn't get optimized out as an unused private function.

Godbolt is a better choice for looking at asm anyway. https://rust.godbolt.org/z/3Y9ovsoz9
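To make the `pub` point concrete, here is a minimal sketch. The article's exact code isn't reproduced in this thread, so the name `cumulative_sum` and the in-place `f64` signature are assumptions. Without `pub`, an uncalled private function can be dropped entirely and no assembly is shown for it.

```rust
// Assumed shape of a cumulative-sum function (not the article's exact code).
// `pub` exports the function, so the compiler has to emit code for it
// even though nothing in this file calls it.
pub fn cumulative_sum(values: &mut [f64]) {
    let mut running = 0.0;
    for v in values.iter_mut() {
        // Each output depends on the previous one (a loop-carried dependency).
        running += *v;
        *v = running;
    }
}
```

On Godbolt, compile with optimizations (for example `-C opt-level=3`) to see the code the compiler would actually generate in a release build.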

Narrator: "The code did not, in fact, auto-vectorise."

(There are only addsd/movsd instructions, which are add/move scalar-double; in vectorised code we'd want addpd and movapd/movupd, which are add/move packed-double.)

Ah, that worked, thanks!

Although I can now see why he didn't include the output directly.

Are you really writing them?

Seems written by an LLM for the most part.