Hacker News new | ask | show | jobs
by alexhutcheson 746 days ago
Modern hash table implementations use vector instructions for lookups:

- Folly: https://github.com/facebook/folly/blob/main/folly/container/...

- Abseil: https://abseil.io/about/design/swisstables

1 comments

Sure; but it’s hard to do and very few programs get optimised to this point. Before reaching for vector instructions, I’ll:

- Benchmark, and verify that the code is hot.

- Rewrite from Python, Ruby, JS into a systems language (if necessary). Honorary mention for C# / Go / Java, which are often fast enough.

- Change to better data structures. Bad data structure choices are still so common.

- Reduce heap allocations. They’re more expensive than you think, especially when you take into account the effect on the cpu cache

Do those things well, and you can often get 3 or more orders of magnitude improved performance. At that point, is it worth reaching for SIMD intrinsics? Maybe. But I just haven’t written many programs where fast code written in a fast language (c, rust, etc) still wasn’t fast enough.

I think it would be different if languages like rust had a high level wrapper around simd that gave you similar performance to hand written simd. But right now, simd is horrible to use and debug. And you usually need to write it per-architecture. Even Intel and amd need different code paths because Intel has dumped avx2.

Outside generic tools like Unicode validation, json parsing and video decoding, I doubt modern simd gets much use. Llvm does what it can but ….

Indeed, people really fixate on “slow languages” but for all but the most demanding of applications, the right algorithm and data structures makes the lions share of the difference.
Reaching for SIMD intrinsics or an abstraction has been historically quite painful in C and C++. But cross-platform SIMD abstractions in C#, Swift and Mojo are changing the picture. You can write a vectorized algorithm in C# and practically not lose performance versus hand-intrinsified C, and CoreLib heavily relies on that.