by baybal2 2674 days ago
> We store strings as NULL terminated C strings. Thus we implicitly assume that you do not include a NULL character within your string, which is allowed technically speaking if you escape it (\u0000).

I've lost count of the broken JSON parsers that all fall for that.

Yeah, this is unforgivable, and for me makes the whole speed argument void.

Edit: to be fair, they handle a couple of other things that many similar libraries ignore. I particularly like the support for full 64-bit integers. And at least they document their limitation on NULL bytes.

"Unforgivable" is a bit strong. I don't think this is something which invalidates our entire approach - nothing in the algorithm depends on this behavior, as the \0 chars don't appear until quite late. Even then, we are not dependent on sighting a \0 in our string normalization, and as such we can probably just store an offset+length in our 'tape' structure rather than assuming we have null-terminated strings.
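
To make the offset+length idea concrete, here is a rough sketch of what it could look like. This is not the library's actual tape layout: `StringRef`, `StringBuffer`, and `c_string_length` are hypothetical names made up for illustration, and the only assumption is that decoded string bytes live in one shared buffer.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical tape entry: where a string starts in the shared buffer
// and how many bytes it spans. No terminator is needed to find its end.
struct StringRef {
    std::size_t offset;
    std::size_t length;
};

// Hypothetical string store (an assumption, not the library's real structure).
struct StringBuffer {
    std::string bytes;  // all decoded strings, stored back to back

    // Copy the decoded bytes in, embedded \0 included, and return their location.
    StringRef append(std::string_view s) {
        StringRef ref{bytes.size(), s.size()};
        bytes.append(s.data(), s.size());
        return ref;
    }

    // Recover the full string from its offset and length.
    std::string_view view(StringRef ref) const {
        return std::string_view(bytes.data() + ref.offset, ref.length);
    }
};

// What a reader relying on null termination sees: it stops at the first \0.
std::size_t c_string_length(const std::string& decoded) {
    return std::strlen(decoded.c_str());
}
```

For the JSON string "a\u0000b", the decoded value is the three bytes 'a', \0, 'b'. Appending `std::string("a\0b", 3)` to the buffer and calling `view` on the returned reference gives back all three bytes, whereas `c_string_length` on the same value stops at the embedded \0 and reports 1.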

Please add an issue on GitHub.

Edit: I went ahead and added an issue. Seems like something we should fix.