Hacker News new | ask | show | jobs
by throwawaymaths 1086 days ago
That doesn't make sense. If you have loop with length you have to check both the content of the byte and the index; if you have null terminated strings you only check the content of the byte.
1 comments

When you have the length, you can unroll the loop, so that you e.g. do 4 iterations at a time. With NUL you can’t do that. Moreover, loop iteration can be done in parallel (instruction-level parallelism) with processing the content of the string, since there is no data dependency between the two. With NUL you introduce a data dependency.
You can't always do those things. Yes, pointer length is almost always faster. But it's not always faster.

https://lemire.me/blog/2020/09/03/sentinels-can-be-faster/?a...

You can. Here is my version: <https://godbolt.org/z/91KecEfbM>

Results:

   N = 10000
  range 1218.1
  sentinel 937.258
  mine 672.227
  ratio 1.29964
(ratio is range/sentinel, not mine/whatever)

You can get even more crazy with SIMD. But for all that you need to know the length beforehand.

Edit: The b4 variable should actually be called b8. That's a remnant of a previous version, where I used 32-bit chunks.

Read the blog post. Dan lemire isn't exactly a slouch.
I read it. I don't see your point. Sentinel values simply give you a very low performance ceiling that you can't optimise past. What I wrote only took me a couple minutes. It's far from optimal (but it already takes 2/3 the time the sentinel version takes, and 1/2 the time Lemire's range version takes). With thorough effort I bet you could get at least 10 times faster than that, provided you are given the length. With sentinels, you're just left with the original performance levels you can't improve.