by glangdale 1876 days ago
This must be some strange new definition of "branchless" with which I'm not previously familiar; the generated code is full of branches, conditional or otherwise. I don't think invoking some other tool to put branches into your code qualifies you as branchless.

Have you compared your parser to simdjson-go? I haven't looked specifically at the Go rewrite, but I hear it's decent.

I was very excited to do branch free coding to parse JSON, but really only handled the 'lexing' portion of the task in branch free fashion. Certainly more of the task could be done branch free, especially if you are working on SIMD or GPGPU.
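
For anyone wondering what "branch free" can look like at the lexing stage, here is a rough Go sketch. It is not code from simdjson or rjson; the `structural` table and `countStructural` function are hypothetical names made up for illustration. The idea is to replace a per-byte `if`/`switch` with a table lookup, so the loop body has no data-dependent branch (the loop itself still has its back-edge, and the compiler has the final say on the generated code).

```go
package main

import "fmt"

// structural maps each of the six JSON structural bytes to 1 and every
// other byte to 0. Hypothetical helper, not part of any library.
var structural = func() [256]uint8 {
	var t [256]uint8
	for _, c := range []byte("{}[]:,") {
		t[c] = 1
	}
	return t
}()

// countStructural adds one table lookup per byte instead of testing each
// byte with if/switch. It does not track string state, so structural
// characters inside string literals are counted too.
func countStructural(data []byte) int {
	n := 0
	for _, b := range data {
		n += int(structural[b]) // index is a byte, so it always fits the 256-entry table
	}
	return n
}

func main() {
	fmt.Println(countStructural([]byte(`{"a":[1,2]}`)))
}
```

A real lexer also has to mask out bytes inside strings and handle escapes, which is where most of the difficulty lies; this only shows the lookup-instead-of-branch pattern.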

I apparently had a fundamental misunderstanding of the meaning of branchless. That's embarrassing.

As for simdjson-go, I did benchmark it. rjson outperformed simdjson-go in most benchmarks, but simdjson-go was about 3% faster reading citm_catalog.json.

https://github.com/WillAbides/rjson#simdjson

Edit:

I see on your bio that you are one of the simdjson authors. I hope you will indulge a question about it. I am generally more interested in parsing a large volume of JSON documents efficiently than I am in doing it quickly. Since simdjson uses parallel instructions, would that mean that the speedup comes at the expense of being able to process more documents in parallel on the same hardware?

I'm not sure I believe your benchmarks, honestly. They are way slower than the C++ version. But I don't really have a horse in the race, as simdjson-go is its own thing. Still, I'm mildly surprised that a byte-by-byte parser is the same speed as a SIMD approach, and when someone tells me this, the answer is usually that they buggered up the benchmarking. I would suggest comparing your numbers to the simdjson paper or whatever brag numbers simdjson-go has as a reality check; one of us is clearly wrong.

The use of parallelism in simdjson is the use of SIMD instructions, not multiple cores. So the answer is "no" to the second question.
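
To make the distinction concrete, here is a minimal Go sketch of document-level parallelism across cores. It uses the standard library's `json.Valid` as a stand-in for whatever parser you prefer; `countValid` is a hypothetical helper, not an API from rjson or simdjson-go. Each goroutine handles whole documents, and whether the parser inside it uses SIMD instructions is a separate matter that happens within a single core.

```go
package main

import (
	"encoding/json"
	"fmt"
	"runtime"
	"sync"
)

// countValid checks documents on `workers` goroutines and returns how many
// are valid JSON. Hypothetical helper; workers must be at least 1.
func countValid(docs [][]byte, workers int) int {
	jobs := make(chan []byte)
	var wg sync.WaitGroup
	var mu sync.Mutex
	valid := 0

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for doc := range jobs {
				// Any parser could go here; json.Valid is just a placeholder.
				if json.Valid(doc) {
					mu.Lock()
					valid++
					mu.Unlock()
				}
			}
		}()
	}

	for _, d := range docs {
		jobs <- d
	}
	close(jobs)
	wg.Wait()
	return valid
}

func main() {
	docs := [][]byte{
		[]byte(`{"a":1}`),
		[]byte(`[1,2,3]`),
		[]byte(`{"broken"`),
	}
	fmt.Println(countValid(docs, runtime.NumCPU()))
}
```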

For anybody who stumbles across this and is interested, this conversation is continued on reddit.

https://old.reddit.com/r/golang/comments/mlhvx0/i_wrote_yet_...