by flohofwoe 898 days ago
The problem was that the JSON parsing code was doing exceptionally stupid things in the first place (like calling sscanf), which doesn't matter on the small JSON files used in testing but exploded in production once that JSON file grew to several MBytes. Just dropping in a different JSON parser library would probably have been enough to fix the problem on R*'s side.
1 comments

Why is it "exceptionally stupid"? sscanf is basically a slightly more primitive regex engine than e.g. PCRE, and I suspect it would work about as fast (if it weren't for that silly strlen() call). And there are lexers that are basically just a loop with a match() call in it, with

    (?P<NUMBER>\d+(\.\d*)?)|(?P<ASSIGN>:=)|(?P<SEMI>;)|(?P<ID>[A-Za-z_][A-Za-z0-9_]*)|(?P<ARITH>[-+*/])|(?P<NEWLINE>\n)|(?P<WHITESPACE>[ \t\r]+)|(?P<MISMATCH>.)
as the pattern or something like that over the input string, and that is not generally considered to be a stupid way to write a lexer. Why would sscanf be?
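
For what it's worth, here is a minimal sketch of that kind of lexer in Python, in case the "loop with a match() call" shape isn't familiar. It assumes Python's `re` module; the names `TOKEN_RE` and `tokenize` are made up for illustration, and skipping whitespace and newlines is my own choice, not something the pattern dictates.

```python
import re

# One alternation of named groups. Earlier alternatives win, so MISMATCH
# only fires when nothing else matches. NUMBER is written as \d+(\.\d*)?,
# a form that compiles with Python's re module.
TOKEN_RE = re.compile(
    r"(?P<NUMBER>\d+(\.\d*)?)"
    r"|(?P<ASSIGN>:=)"
    r"|(?P<SEMI>;)"
    r"|(?P<ID>[A-Za-z_][A-Za-z0-9_]*)"
    r"|(?P<ARITH>[-+*/])"
    r"|(?P<NEWLINE>\n)"
    r"|(?P<WHITESPACE>[ \t\r]+)"
    r"|(?P<MISMATCH>.)"
)


def tokenize(text):
    """Yield (kind, lexeme) pairs. Hypothetical helper, not a library API."""
    pos = 0
    while pos < len(text):
        # match() is anchored at pos, so each call only looks at the text
        # from the current offset onward.
        m = TOKEN_RE.match(text, pos)
        kind = m.lastgroup  # name of the alternative that matched
        if kind == "MISMATCH":
            raise ValueError(f"unexpected {m.group()!r} at offset {pos}")
        if kind not in ("WHITESPACE", "NEWLINE"):
            yield kind, m.group()
        pos = m.end()
```

Calling `list(tokenize("x := 3.14 + y;"))` gives the ID, ASSIGN, NUMBER, ARITH, ID and SEMI tokens in order. The point is that every iteration resumes at `m.end()` and never re-measures the rest of the input, which is exactly the part that sscanf's hidden strlen() gets wrong.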