| HN Mirror

	by umanwizard 1086 days ago
	If you have to walk the string anyway, the null terminator has no downside.

zerodensity 1086 days ago

According to the bible (https://www.agner.org/optimize/), it's faster to use a loop with a length than to walk through a pointer, so not having a length makes it slower to walk the whole string. It also makes things like SIMD optimizations harder for the compiler to do.

throwawaymaths 1086 days ago

That doesn't make sense. If you have a loop with a length, you have to check both the content of the byte and the index; if you have null-terminated strings, you only check the content of the byte.
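
A minimal sketch of the two loop shapes being compared here, assuming the task is counting occurrences of a byte. The task and both function names are illustrative and do not come from the thread or from any linked benchmark.

```c
#include <stddef.h>

/* Sentinel (NUL-terminated) version: the loop-exit test depends on the
   byte that was just loaded from memory. */
size_t count_char_sentinel(const char *s, char c) {
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        n += (*s == c);
    }
    return n;
}

/* Length version: the loop-exit test depends only on the index and the
   length, not on the contents of the string. */
size_t count_char_len(const char *s, size_t len, char c) {
    size_t n = 0;
    for (size_t i = 0; i < len; ++i) {
        n += (s[i] == c);
    }
    return n;
}
```

The length version also handles strings with embedded NUL bytes, since it never inspects the content to decide when to stop.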

GrumpySloth 1086 days ago

When you have the length, you can unroll the loop, so that you e.g. do 4 iterations at a time. With NUL you can’t do that. Moreover, loop iteration can be done in parallel (instruction-level parallelism) with processing the content of the string, since there is no data dependency between the two. With NUL you introduce a data dependency.
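
A rough sketch of the unrolling described above, assuming the same illustrative byte-counting task. The function name and the choice of four independent accumulators are assumptions made for the example, not something taken from the thread.

```c
#include <stddef.h>

/* Counts occurrences of c in s[0..len), four bytes per iteration.
   Because len is known up front, the loop bound does not depend on the
   bytes being read, and the four accumulators are independent of each
   other, so the CPU is free to overlap the work. */
size_t count_char_unrolled(const char *s, size_t len, char c) {
    size_t n0 = 0, n1 = 0, n2 = 0, n3 = 0;
    size_t i = 0;

    /* Main loop: runs while at least four bytes remain. */
    for (; i + 4 <= len; i += 4) {
        n0 += (s[i]     == c);
        n1 += (s[i + 1] == c);
        n2 += (s[i + 2] == c);
        n3 += (s[i + 3] == c);
    }

    /* Tail loop: handles the remaining 0 to 3 bytes. */
    size_t n = n0 + n1 + n2 + n3;
    for (; i < len; ++i) {
        n += (s[i] == c);
    }
    return n;
}
```

Whether this beats a plain loop depends on the compiler and target; optimizing compilers often unroll or vectorize the simple length-based loop on their own.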

throwawaymaths 1086 days ago

You can't always do those things. Yes, pointer plus length is almost always faster. But it's not always faster.

https://lemire.me/blog/2020/09/03/sentinels-can-be-faster/?a...

yakubin 1085 days ago

You can. Here is my version: <https://godbolt.org/z/91KecEfbM>

Results:

  N = 10000
  range    1218.1
  sentinel 937.258
  mine     672.227
  ratio    1.29964

(ratio is range/sentinel, not mine/whatever)

You can get even more crazy with SIMD. But for all that you need to know the length beforehand.

Edit: The b4 variable should actually be called b8. That's a remnant of a previous version, where I used 32-bit chunks.
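
To illustrate what chunked processing with a known length can look like, here is a small sketch that checks whether a buffer is pure ASCII, 8 bytes at a time. This is not the code behind the godbolt link above; the task and the function name `is_ascii_len` are assumptions chosen for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns true if every byte in s[0..len) has its high bit clear.
   Knowing len makes it safe to read 8 bytes at once: the main loop
   never touches memory past the end of the buffer. */
bool is_ascii_len(const char *s, size_t len) {
    size_t i = 0;

    /* Main loop: test the high bit of 8 bytes with a single AND. */
    for (; i + 8 <= len; i += 8) {
        uint64_t chunk;
        memcpy(&chunk, s + i, sizeof chunk); /* safe unaligned load */
        if (chunk & UINT64_C(0x8080808080808080)) {
            return false;
        }
    }

    /* Tail loop: the remaining 0 to 7 bytes, one at a time. */
    for (; i < len; ++i) {
        if ((unsigned char)s[i] & 0x80) {
            return false;
        }
    }
    return true;
}
```

With only a NUL terminator, an 8-byte load near the end of the string could run past the terminator, so the same trick needs extra care around alignment.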

throwawaymaths 1085 days ago

Read the blog post. Dan Lemire isn't exactly a slouch.
