by abaines 2173 days ago
These are my results on an (admittedly old) Intel i3-2120 CPU @ 3.30GHz. Compiling both programs with -O3:

    ojc_parse_str    1000000 entries in 3607.615 msecs. (  277 iterations/msec)
    simdjson_parse   1000000 entries in  418.997 msecs. ( 2386 iterations/msec)
And I might as well throw in my own parser:

    uj_parse   1000000 entries in 1959.731 msecs. (  510 iterations/msec)
The -O3 seems to make a large difference for simdjson.
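
For anyone wanting to reproduce numbers in this shape, here is a rough C++ sketch of a timing harness. It is not the harness that produced the results above; `time_msecs`, `report`, and `iterations_per_msec` are names made up for illustration, and the rate is truncated to a whole number because that is what the printed figures appear to do (1000000 / 418.997 is about 2386.6, shown as 2386).

```cpp
#include <chrono>
#include <cstdio>

// Whole iterations per millisecond, truncated toward zero.
// Assumption: the figures above were truncated rather than rounded.
inline long iterations_per_msec(long iterations, double msecs) {
    if (msecs <= 0.0) {
        return 0;  // too fast to measure; avoid dividing by zero
    }
    return static_cast<long>(iterations / msecs);
}

// Call `fn` `iterations` times and return elapsed wall-clock milliseconds.
// `fn` should do observable work, or an optimizing build may remove the loop.
template <typename Fn>
double time_msecs(Fn&& fn, long iterations) {
    auto start = std::chrono::steady_clock::now();
    for (long i = 0; i < iterations; ++i) {
        fn();
    }
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Print one result line in roughly the same layout as the output above.
template <typename Fn>
void report(const char* name, Fn&& fn, long iterations) {
    double msecs = time_msecs(fn, iterations);
    std::printf("%s %ld entries in %8.3f msecs. (%5ld iterations/msec)\n",
                name, iterations, msecs,
                iterations_per_msec(iterations, msecs));
}
```

Building the same harness once without optimization flags and once with -O3 is the comparison being discussed here.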

Shockingly different. The -O3 option made hardly any difference with OjC but more than a 10x difference with simdjson. I'll be removing the claim from the OjC readme.

Thank you for being civil with your reply. Much appreciated.

What I've learned from this (as a simdjson author) is that we need to update the quick start in the README to include -O3. I was so psyched that we now compile warning-free without any parameters that I didn't stop to think that some people would go "huh, I did what they told me and simdjson is slow, wtf." Because we evidently told you to compile it in debug mode in the quick start :)

simdjson relies deeply on inlining to let us write performant code that is also readable.
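
To make the inlining point concrete, here is a toy C++ sketch. It is not simdjson code, and every name in it is hypothetical. The logic is split into tiny named helpers for readability; without optimization each helper is typically a real function call per input byte, while at -O3 the compiler will usually fold them all into the loop.

```cpp
#include <cstddef>
#include <string>

// Small, readable helpers. In an unoptimized build each one is usually an
// actual call; an optimizing build will typically inline them.
inline bool is_bracket(char c) {
    return c == '{' || c == '}' || c == '[' || c == ']';
}

inline bool is_separator(char c) {
    return c == ':' || c == ',';
}

inline bool is_structural(char c) {
    return is_bracket(c) || is_separator(c);
}

// Toy hot loop: count JSON structural characters in a buffer.
// It does not skip string contents, so it is not a real tokenizer.
inline std::size_t count_structurals(const std::string& json) {
    std::size_t count = 0;
    for (char c : json) {
        count += is_structural(c) ? 1 : 0;
    }
    return count;
}
```

Getting the same speed without optimization would mean writing all three comparisons out by hand inside the loop, which is the readability cost being described.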

Sorry to have sent you down a blind alley!

One thing to note: if you want to get good numbers to chew on, we have a bunch of really good real world examples, of ALL sizes (simdjson is about big and small), in the jsonexamples/ directory of simdjson. And if you want to check ojc's validation, there are a number of pass.json and fail.json files in the jsonchecker/ directory.
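
A validation check against those files could look something like the C++ sketch below. It assumes file names begin with "pass" or "fail" as described above; `validate` is a stand-in for whichever parser is being tested (it is not an OjC or simdjson function), and the caller supplies the list of paths.

```cpp
#include <cstddef>
#include <fstream>
#include <functional>
#include <sstream>
#include <string>
#include <vector>

// Assumption: a base name starting with "pass" means the parser must accept
// the file; anything else (e.g. "fail...") means it must reject it.
inline bool expected_valid(const std::string& path) {
    std::size_t slash = path.find_last_of("/\\");
    std::string base =
        (slash == std::string::npos) ? path : path.substr(slash + 1);
    return base.compare(0, 4, "pass") == 0;
}

// Read a whole file into a string (empty if the file cannot be opened).
inline std::string read_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    std::ostringstream buf;
    buf << in.rdbuf();
    return buf.str();
}

// Count files where the validator's verdict disagrees with the file name.
inline int count_mismatches(
    const std::vector<std::string>& paths,
    const std::function<bool(const std::string&)>& validate) {
    int mismatches = 0;
    for (const std::string& path : paths) {
        if (validate(read_file(path)) != expected_valid(path)) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

A result of zero means the parser accepted every "pass" file and rejected every "fail" file in the list.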

simdjson's source is considerably easier to read when we can rely on -O3. To get the same performance at lower optimization levels, we would need to do a lot of things by hand that make the source code quite difficult to read and work with.