Hacker News new | ask | show | jobs
by throwawaymaths 1086 days ago
You can't always do those things. Yes, pointer length is almost always faster. But it's not always faster.

https://lemire.me/blog/2020/09/03/sentinels-can-be-faster/?a...

1 comments

You can. Here is my version: <https://godbolt.org/z/91KecEfbM>

Results:

   N = 10000
  range 1218.1
  sentinel 937.258
  mine 672.227
  ratio 1.29964
(ratio is range/sentinel, not mine/whatever)

You can get even more crazy with SIMD. But for all that you need to know the length beforehand.

Edit: The b4 variable should actually be called b8. That's a remnant of a previous version, where I used 32-bit chunks.

Read the blog post. Dan lemire isn't exactly a slouch.
I read it. I don't see your point. Sentinel values simply give you a very low performance ceiling that you can't optimise past. What I wrote only took me a couple minutes. It's far from optimal (but it already takes 2/3 the time the sentinel version takes, and 1/2 the time Lemire's range version takes). With thorough effort I bet you could get at least 10 times faster than that, provided you are given the length. With sentinels, you're just left with the original performance levels you can't improve.