by mumumu
1243 days ago
This is true for a few streaming applications (such as parsing), and most of the speedup comes from tricks that avoid branches. There is a great blog post from one of the authors of simdjson discussing this. I'm on mobile, but there is a link to the blog post in the simdjson GitHub repository.
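To make "avoiding branches" concrete, here is a minimal sketch of the idea, not simdjson's actual code. It builds a bitmask of JSON structural characters for a block of bytes, once with a per-byte `if` and once with a table lookup in place of the data-dependent branch. The function names and the choice of `{}[]:,` as the structural set are assumptions made for illustration. Python cannot show the speedup itself; it only shows the shape of the transformation that a SIMD implementation would apply to many bytes at once.

```python
# Bytes treated as JSON structural characters in this sketch (an assumption).
STRUCTURAL = b'{}[]:,'

# 256-entry lookup table: 1 for structural bytes, 0 for everything else.
IS_STRUCTURAL = bytes(1 if b in STRUCTURAL else 0 for b in range(256))


def structural_mask_branchy(block: bytes) -> int:
    """Reference version: one data-dependent branch per input byte."""
    mask = 0
    for i, b in enumerate(block):
        if b in STRUCTURAL:
            mask |= 1 << i
    return mask


def structural_mask_branchless(block: bytes) -> int:
    """Same result; the byte value only feeds a table lookup and a shift."""
    mask = 0
    for i, b in enumerate(block):
        mask |= IS_STRUCTURAL[b] << i
    return mask
```

Both functions set bit `i` when byte `i` is structural. The only branch left in the second version is the loop itself, which does not depend on the input data and is therefore easy for the CPU to predict.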
The blog post I mentioned:
Paper: Parsing Gigabytes of JSON per Second https://branchfree.org/2019/02/25/paper-parsing-gigabytes-of...
Another related post from Lemire:
Ridiculously fast unicode (UTF-8) validation https://lemire.me/blog/2020/10/20/ridiculously-fast-unicode-...
Those algorithms are fast, but to put them in perspective: a single x86 CPU can write 64 B per cycle. At 5 GHz, the theoretical maximum bandwidth is 320 GB/s. IIRC, the read bandwidth is twice that.
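The arithmetic behind that figure can be checked in a few lines. This is only a back-of-the-envelope sketch: the function name is made up for illustration, and the 64 B per cycle and 5 GHz inputs are the numbers quoted above, not measurements.

```python
def peak_bandwidth_gb_per_s(bytes_per_cycle: int, clock_ghz: float) -> float:
    """Peak bandwidth in GB/s if every cycle moves bytes_per_cycle bytes."""
    # bytes/cycle * 1e9 cycles/s per GHz = bytes/s; dividing by 1e9 gives GB/s,
    # so the two factors of 1e9 cancel.
    return bytes_per_cycle * clock_ghz


write_peak = peak_bandwidth_gb_per_s(64, 5.0)
# The read figure follows the "twice that" recollection above (unverified).
read_peak = 2 * write_peak
```

These are upper bounds that assume a full-width store (or load) retires on every single cycle.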
There are other bottlenecks, and it is very hard to write code that writes on every cycle.
An interesting consequence is that the theoretical maximum bandwidth is logarithmic in the number of cycles. Again, this is about branchless streaming applications.