| HN Mirror



	by kllrnohj 733 days ago
	Those 1-4 bytes are sitting in a register the entire time and are thus basically free to read as often as you want, though. An actual sampled profile showing the two approaches would be interesting. Naively, it seems like it's faster just because it has faster UTF-8 handling, and that this has nothing to do with being a state machine exactly.


brainwad 732 days ago

According to the authors it's also faster on files full of 'x' or ' ', so there must be more to it than just better Unicode support.


yencabulator 732 days ago

Even something as simple as wc calling a libc function for parsing UTF-8 that doesn't get inlined would destroy its performance relative to anything optimized.

Personally, I'd expect SIMD to win over all of these. wc sounds like the kind of challenge that's very easy to partition and process in chunks, though UTF-8 might ruin that.
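
For the character count at least, UTF-8 need not get in the way of chunking. Here is a minimal sketch, assuming the input is valid UTF-8: every byte that is not a continuation byte (bit pattern 10xxxxxx) starts a code point, so each chunk can be counted without looking at its neighbors. The function names are made up for illustration and this is not how any particular wc does it.

```python
def count_chars(chunk: bytes) -> int:
    # A UTF-8 continuation byte looks like 10xxxxxx; any other byte
    # begins a new code point. Assumes the overall input is valid UTF-8.
    return sum((b & 0xC0) != 0x80 for b in chunk)


def count_chars_chunked(data: bytes, chunk_size: int) -> int:
    # Chunks may split a multi-byte sequence anywhere; the per-chunk
    # counts still add up because each byte is classified on its own.
    return sum(
        count_chars(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    )


if __name__ == "__main__":
    sample = "naïve café 日本語 x".encode("utf-8")
    print(count_chars_chunked(sample, 3))
```

Word counting is the harder part, since whether a byte starts a word depends on the byte before it.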


teo_zero 731 days ago

> very easy to partition and process in chunks

Which counters to increment at each byte depends on the previous bytes, though. You could probably succeed using overlapping chunks, but I wouldn't call it very easy.
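
One way to handle that dependency without overlapping chunks is to have each chunk report a small summary and fix up the seams afterwards. This is only a sketch under assumptions: words are runs of non-whitespace bytes, only ASCII whitespace counts as a separator, and the names `Summary`, `summarize` and `merge` are hypothetical.

```python
from dataclasses import dataclass
from functools import reduce

WHITESPACE = b" \t\n\r\v\f"  # ASCII whitespace only (an assumption)


@dataclass(frozen=True)
class Summary:
    words: int            # words counted as if the chunk stood alone
    starts_in_word: bool  # first byte is not whitespace
    ends_in_word: bool    # last byte is not whitespace
    empty: bool = False   # identity element for merge


EMPTY = Summary(0, False, False, empty=True)


def summarize(chunk: bytes) -> Summary:
    if not chunk:
        return EMPTY
    words = 0
    in_word = False
    for b in chunk:
        is_space = b in WHITESPACE
        if not is_space and not in_word:
            words += 1  # transition from whitespace into a word
        in_word = not is_space
    return Summary(words, chunk[0] not in WHITESPACE, in_word)


def merge(a: Summary, b: Summary) -> Summary:
    if a.empty:
        return b
    if b.empty:
        return a
    # A word that straddles the seam was counted once on each side.
    straddles = 1 if a.ends_in_word and b.starts_in_word else 0
    return Summary(a.words + b.words - straddles,
                   a.starts_in_word, b.ends_in_word)


def count_words_chunked(data: bytes, chunk_size: int) -> int:
    parts = (summarize(data[i:i + chunk_size])
             for i in range(0, len(data), chunk_size))
    return reduce(merge, parts, EMPTY).words
```

Since `merge` is associative, the summaries could be computed in parallel and combined in any grouping. Line counts need no fix-up at all, because a newline byte counts the same wherever the chunk boundary falls.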


yencabulator 731 days ago

That sort of "find the correct chunk boundary" logic was very common with all the mapreduce processing that was done when people still used the phrase big data.
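
A rough sketch of that boundary-finding idea, assuming ASCII whitespace separates words: pick split points at roughly equal offsets, then slide each one forward until it sits right after a whitespace byte, so no word straddles two chunks. `aligned_boundaries` and `target` are names invented for this example.

```python
WHITESPACE = b" \t\n\r\v\f"  # ASCII whitespace only (an assumption)


def aligned_boundaries(data: bytes, target: int) -> list[int]:
    # Returns offsets [0, ..., len(data)] such that every interior
    # offset directly follows a whitespace byte. target must be >= 1.
    n = len(data)
    bounds = [0]
    pos = target
    while pos < n:
        # Slide forward until the byte before the split is whitespace.
        while pos < n and data[pos - 1] not in WHITESPACE:
            pos += 1
        if pos >= n:
            break
        bounds.append(pos)
        pos += target
    bounds.append(n)
    return bounds


def count_words_aligned(data: bytes, target: int) -> int:
    bounds = aligned_boundaries(data, target)
    # Each chunk can now be counted on its own, e.g. by a separate worker.
    return sum(
        len(data[start:end].split())
        for start, end in zip(bounds, bounds[1:])
    )
```

The catch is that a long run without whitespace produces one oversized chunk, so the work is only as evenly divided as the input allows.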
