Hacker News
by jkeiser 2275 days ago
Streaming and selective parsing are good things, and something we're looking into for the next set of features.

Note that there are real speed gains to be had by not being selective. The CPU is normally sprinting ahead, executing future instructions before you even get to them, and every branch can make it trip and land flat on its face.

We've found we have to be very careful with every branch we introduce. I've tried introducing branches that skip tons of code and ought to make things a ton faster, but which instead slowed things down.
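To make the branch trade-off concrete, here is a minimal sketch. It is not simdjson code, and the names `is_structural`, `structurals_branchy`, and `structurals_branchless` are made up for illustration. Both functions record the positions of JSON structural characters. The first only stores when it finds one; the second always stores and then advances the output cursor by 0 or 1, so it does more work per byte but has no data-dependent branch in the loop body.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <string_view>
#include <vector>

// Hypothetical helper: returns 1 if c is a JSON structural character, else 0.
// Bitwise | on unsigned values is used instead of || so that no
// short-circuit evaluation is requested from the compiler.
inline std::uint32_t is_structural(char c) {
    return unsigned(c == '{') | unsigned(c == '}') | unsigned(c == '[') |
           unsigned(c == ']') | unsigned(c == ':') | unsigned(c == ',');
}

// Branchy: skips the store for most bytes, at the cost of one
// hard-to-predict branch per byte.
std::vector<std::uint32_t> structurals_branchy(std::string_view s) {
    assert(s.size() <= std::numeric_limits<std::uint32_t>::max());
    std::vector<std::uint32_t> out(s.size());
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (is_structural(s[i])) {
            out[n++] = static_cast<std::uint32_t>(i);
        }
    }
    out.resize(n);
    return out;
}

// Branchless: stores on every byte, then advances the cursor by 0 or 1.
// A non-matching store is simply overwritten by the next iteration.
std::vector<std::uint32_t> structurals_branchless(std::string_view s) {
    assert(s.size() <= std::numeric_limits<std::uint32_t>::max());
    std::vector<std::uint32_t> out(s.size());  // n <= i, so out[n] is in bounds
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        out[n] = static_cast<std::uint32_t>(i);
        n += is_structural(s[i]);
    }
    out.resize(n);
    return out;
}
```

Which version wins depends on the input and on what the compiler does with each loop, so this is something to benchmark rather than assume.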

1 comment

That's true, of course. Branching is costly and malloc is costly. But there is still a need to filter objects and arrays in the second stage, after parsing, during the conversion to your language's data structures. With a slower, separate function, of course.

Parsing is the negligibly fast part; converting the result into your business objects is the slow part. This is usually done in the second step, over the tape. It involves a lot of mallocs, so you really want to skip the unneeded parts and filter upfront.
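A rough sketch of what filtering upfront could look like during the tape walk. The tape layout here is a simplified, hypothetical one (not simdjson's actual format): each container-start entry is assumed to store the index just past its matching end, so an unwanted subtree is skipped in one step without allocating anything. `TapeEntry`, `Wanted`, `skip_value`, and `extract` are all illustrative names.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

enum class Kind { ObjectStart, ObjectEnd, ArrayStart, ArrayEnd, Key, Number };

// Simplified, hypothetical tape entry.
struct TapeEntry {
    Kind kind;
    std::string_view key;  // meaningful when kind == Key
    double number;         // meaningful when kind == Number
    std::size_t after;     // for *Start entries: index just past the matching *End
};

// The only fields the application cares about.
struct Wanted {
    double id = 0;
    double score = 0;
};

// Index of the entry following the value that begins at tape[i].
std::size_t skip_value(const std::vector<TapeEntry>& tape, std::size_t i) {
    const Kind k = tape[i].kind;
    return (k == Kind::ObjectStart || k == Kind::ArrayStart) ? tape[i].after : i + 1;
}

// Walks the top-level object, copies the wanted numbers straight into the
// struct, and jumps over every other value without visiting its contents.
Wanted extract(const std::vector<TapeEntry>& tape) {
    assert(!tape.empty() && tape[0].kind == Kind::ObjectStart);
    Wanted w;
    std::size_t i = 1;  // first key, or the closing ObjectEnd
    while (tape[i].kind != Kind::ObjectEnd) {
        const std::string_view key = tape[i].key;
        const std::size_t v = i + 1;  // the value that follows the key
        if (tape[v].kind == Kind::Number) {
            if (key == "id") {
                w.id = tape[v].number;
            } else if (key == "score") {
                w.score = tape[v].number;
            }
        }
        i = skip_value(tape, v);
    }
    return w;
}
```

For a document like `{"id": 7, "blob": [1, 2, 3], "score": 9.5}`, the `blob` array is never materialized: the walk goes from its `ArrayStart` entry directly to the `score` key.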

Still thinking how to design my new JSON parser around simdjson.

Agreed, we're thinking it through too. Most JSON parsing could be done with almost no second stage at all, or at least by writing directly into a user struct/array, if you had the right API.