by wood_spirit
957 days ago
My own lessons from writing fast JSON parsers include a lot of language-specific things, but here are some generalisations:

Avoid heap allocations in tokenising. Make the tokeniser a function that returns a stack-allocated struct, or an int64 token that is a packed field describing the start offset, length, type etc. of the token.

Avoid heap allocations in parsing: support a getString(key String) type interface for clients that want to chop up a buffer.

For deserialising to objects where you know the fields at compile time, generally generate a switch case on key length before comparing string values.

My experience in data pipelines that process lots of JSON is that the choice of JSON library can make a 3-10x performance difference, and that all the main parsers want to allocate objects.

If the classes you are serialising or deserialising are known at compile time then Jackson (Java) does a good job, but you can get a 2x boost with careful coding and profiling. Whereas if you are parsing arbitrary JSON, all the mainstream parsers want to do lots of allocations that a more intrusive parser you write yourself can avoid, and you can make massive performance wins if you are processing thousands or millions of objects per second.
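To make the packed-token and key-length ideas concrete, here is a rough Java sketch. Everything in it is my own illustration rather than any library's API: the class name `PackedTokens`, the bit layout (4 bits type, 28 bits length, 32 bits start), and the example fields `id`, `name` and `email` are all assumptions. It does no validation and assumes ASCII keys without escapes.

```java
final class PackedTokens {
    // Hypothetical token kinds for this sketch.
    static final int END = 0, STRING = 1, NUMBER = 2, PUNCT = 3, LITERAL = 4;

    // Assumed layout of the long: [type: 4 bits][length: 28 bits][start: 32 bits].
    static long pack(int type, int start, int length) {
        return ((long) type << 60) | ((long) length << 32) | (start & 0xFFFFFFFFL);
    }
    static int type(long t)   { return (int) (t >>> 60); }
    static int length(long t) { return (int) ((t >>> 32) & 0x0FFFFFFFL); }
    static int start(long t)  { return (int) t; }

    // Position at which to look for the following token
    // (string tokens exclude their quotes, so step past the closing one).
    static int end(long t) {
        return start(t) + length(t) + (type(t) == STRING ? 1 : 0);
    }

    static boolean isNumberChar(byte c) {
        return (c >= '0' && c <= '9') || c == '.' || c == 'e' || c == 'E' || c == '+' || c == '-';
    }

    // Returns the next token at or after pos as a primitive long: no heap allocation.
    static long next(byte[] buf, int pos) {
        while (pos < buf.length
                && (buf[pos] == ' ' || buf[pos] == '\n' || buf[pos] == '\t' || buf[pos] == '\r')) {
            pos++;
        }
        if (pos >= buf.length) return pack(END, pos, 0);
        byte c = buf[pos];
        if (c == '"') {
            int i = pos + 1;
            while (i < buf.length && buf[i] != '"') {
                if (buf[i] == '\\') i++; // skip the escaped character
                i++;
            }
            return pack(STRING, pos + 1, i - (pos + 1));
        }
        if (c == '-' || (c >= '0' && c <= '9')) {
            int i = pos + 1;
            while (i < buf.length && isNumberChar(buf[i])) i++;
            return pack(NUMBER, pos, i - pos);
        }
        if (c >= 'a' && c <= 'z') { // true, false, null
            int i = pos + 1;
            while (i < buf.length && buf[i] >= 'a' && buf[i] <= 'z') i++;
            return pack(LITERAL, pos, i - pos);
        }
        return pack(PUNCT, pos, 1); // { } [ ] : , or anything unrecognised
    }

    // Compare a key in the buffer against a known field name without building a String.
    static boolean matches(byte[] buf, int s, String key) {
        for (int i = 0; i < key.length(); i++) {
            if (buf[s + i] != key.charAt(i)) return false;
        }
        return true;
    }

    // Switch on key length first, then compare bytes.
    // Returns a field index for the hypothetical record {id, name, email}, or -1.
    static int fieldOf(byte[] buf, long keyToken) {
        int s = start(keyToken);
        switch (length(keyToken)) {
            case 2:  return matches(buf, s, "id") ? 0 : -1;
            case 4:  return matches(buf, s, "name") ? 1 : -1;
            case 5:  return matches(buf, s, "email") ? 2 : -1;
            default: return -1;
        }
    }
}
```

A caller walks the buffer with `long t = PackedTokens.next(buf, pos); pos = PackedTokens.end(t);` until it sees `END`, and only materialises a String or a number when it actually needs the value. In generated code the length switch would typically have one case per distinct key length, with a second comparison where several fields share a length.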