by manfredo 2470 days ago
If it's exclusively going to be used for heavyweight operations like these, it's probably better to benchmark against protobuf decoding. I guess using JSON has a "works out of the box" appeal, and doesn't require defining any protobuf schema. But personally I don't see defining proto files as too prohibitive in terms of development cost.

> benchmark against protobuf decoding

Protobuf isn't built into the browser, so it can't bypass the JS parse & execute time. Instead you'd be parsing protobuf's JS, executing it, parsing proto, and producing objects. It'd be worth doing, sure, but it'd almost certainly be the slowest option by far since it's doing way more stuff in JS than either of the other two options and the JS syntax parse is the slow part.
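
For reference, a minimal sketch of what "the other two options" presumably are: a plain object literal versus the same data wrapped in a JSON.parse call. That reading is an assumption (the comment doesn't spell them out), and the data shape is made up for illustration.

```typescript
// Hypothetical data; the shape is only for illustration.

// Option 1: object literal. The engine parses this with the full JS grammar.
const viaLiteral = { name: "demo", retries: 3, verbose: false, tags: ["a", "b"] };

// Option 2: the same data as a string literal handed to the built-in JSON parser.
// The JS parser only scans a single string token; JSON.parse does the rest natively.
const viaJsonParse = JSON.parse('{"name":"demo","retries":3,"verbose":false,"tags":["a","b"]}');

// Both forms yield structurally identical objects.
const sameResult = JSON.stringify(viaLiteral) === JSON.stringify(viaJsonParse);
```

A protobuf decoder would be a third path, where the library's own JS has to be parsed and executed before any data is decoded.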

These benchmarks indicate better protobuf performance [1]. Compute time these days is often dominated by memory transfer rates. The "slowness" of JavaScript seems to be offset by there being less data to begin with. Collapsing a 100KB resource down to 50 or 25KB is usually worth it even if you have to do more operations in JavaScript. Not to mention that end-to-end load time (which is probably what people are usually trying to optimize for) can be lower when less data needs to travel over the wire or radio.

At the end of the day, who knows if the use case hits edge cases or stresses parts of the implementation that are not optimized, whether for JSON decoding or for protobuf. Getting meaningful performance data ultimately needs to be experimental, and resists categorical answers about whether X is faster than Y.

1. https://www.npmjs.com/package/protobufjs#performance

This article goes into a bit more detail: https://auth0.com/blog/beating-json-performance-with-protobu...

> These benchmarks indicate better protobuf performance [1].

We're exclusively talking about cold start performance here. Single, one-time object creation. That's why JS syntax parse is the dominant factor and not execution performance. Those benchmarks don't measure that; they measure hot performance. That's a completely different thing.
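
To make the cold/hot distinction concrete, here is a rough sketch that times the first call of a function separately from later calls. The helper name `timeFirstVersusWarm` and the payload are made up for illustration. It only captures in-process warm-up; the cost of fetching and syntax-parsing a library's source is paid before any such code runs and would have to be measured from a fresh page load.

```typescript
// Times one first ("cold-ish") call, then averages later ("hot") calls of the same function.
// This does NOT include script download or syntax parse; those happen before this code runs.
function timeFirstVersusWarm(
  fn: () => unknown,
  warmRuns: number
): { firstMs: number; warmMeanMs: number } {
  const runs = Math.max(1, warmRuns); // avoid dividing by zero

  const t0 = performance.now();
  fn();
  const firstMs = performance.now() - t0;

  const t1 = performance.now();
  for (let i = 0; i < runs; i++) fn();
  const warmMeanMs = (performance.now() - t1) / runs;

  return { firstMs, warmMeanMs };
}

// Hypothetical payload: 1000 booleans serialized as JSON text.
const payload = JSON.stringify({ flags: Array.from({ length: 1000 }, (_, i) => i % 2 === 0) });
const timing = timeFirstVersusWarm(() => JSON.parse(payload), 100);
```

A benchmark that only reports something like `warmMeanMs` says little about a one-time decode at startup.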

> Not to mention end to end load time (which is probably what people are usually trying to optimize for) can be lower by reducing how much data needs to travel over the wire or radio.

Wire transfer size would need to be looked at differently. The JS code and the JSON string are both also going to be compressed, unless you're not using a compressed Content-Encoding for some reason.

What is the "completely different thing" you're referring to here? Between:

1. Having a static JSON string, and decoding that string.

and

2. Having a static blob, and using protobufs to decode that blob.

These two accomplish the same thing. I'm not sure why you seem to think one is a "cold start" and the other is "hot" - they're both "single, one-time object creation". The former is going to be parsing ints and floats as ASCII, and reading in "true" and "false". Regardless of compression, the memory-inefficient JSON encoding is going to be used (whether it's over the wire, or just as an intermediate representation during parsing). I've used protobuf decoding for things like localizations and configurations before - the "cold start" use case you're talking about - and in many circumstances it does result in faster loading. My back-of-the-napkin reasoning is that this kind of data is heavily weighted toward booleans and integers, which are much more efficiently encoded in protobufs than in JSON, so if you had a use case that almost entirely decoded strings, the performance difference may not be the same.
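
A rough size comparison for that kind of data, as a sketch only: it counts bytes for base-128 varints (the integer encoding in the protobuf wire format) against the JSON text for the same values. The payload is made up, and `varintSize` is a hand-written helper, not a protobuf library call.

```typescript
// Bytes needed for one non-negative integer as a base-128 varint
// (7 payload bits per byte, as in the protobuf wire format).
function varintSize(n: number): number {
  let size = 1;
  while (n >= 128) {
    n = Math.floor(n / 128);
    size++;
  }
  return size;
}

// Hypothetical payload: 1000 small integers and 1000 booleans.
const ints = Array.from({ length: 1000 }, (_, i) => i % 300);
const flags = Array.from({ length: 1000 }, (_, i) => i % 2 === 0);

// The JSON text is pure ASCII here, so string length equals byte length.
const jsonBytes = JSON.stringify({ ints, flags }).length;

// Packed repeated fields: values laid out back to back, one byte per bool.
// Field tags and length prefixes (a few bytes in total) are ignored.
const packedBytes = ints.reduce((sum, n) => sum + varintSize(n), 0) + flags.length;
```

With this particular mix the JSON text comes out several times larger than the packed bytes; the ratio depends entirely on the data, and this says nothing about size after compression or about decode time.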

Are you including the cost of loading protobuf itself? You seem to be basing your argument on an assumed already present & loaded protobuf library.

You need to benchmark starting from nothing at all. Your link that you seem to be basing this off of has a loaded and fully JIT'd protobuf. That's not the start state.

You can measure the impact on loading time, and the size of the protobuf implementation you're using probably affects the threshold at which it becomes more efficient. I don't doubt that parsing a 500-character JSON string is faster than loading a protobuf library to do the same job. In fact, apparently this JSON parsing trick is only effective beyond 10K or so. But past a certain threshold, memory bandwidth is more crucial than loading code. If your data consists mostly of booleans and integers, then JSON can often be an order of magnitude larger in size than protobufs. If it's compressed, then decompressing it takes clock cycles and the parsing code is still parsing the larger uncompressed JSON text. A protobuf library can often skip compression altogether by virtue of using normal ints and bits for numbers and booleans. So while the protobuf library does have some additional overhead, it often has higher throughput for many types of data.
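
The threshold argument can be written down as a toy linear model. This is only a sketch: the function `breakEvenJsonBytes` and all four parameters are hypothetical, the costs are assumed to scale linearly with size, and every input would have to come from your own measurements.

```typescript
// All parameters are hypothetical inputs you would have to measure yourself.
//   libraryLoadMs   one-time cost of fetching, parsing and running the protobuf library
//   jsonMsPerByte   JSON parse cost per byte of JSON text
//   protoMsPerByte  protobuf decode cost per byte of encoded data
//   sizeRatio       JSON size divided by protobuf size for the same data
// Returns the JSON payload size in bytes above which protobuf wins,
// or Infinity if it never does under this model.
function breakEvenJsonBytes(
  libraryLoadMs: number,
  jsonMsPerByte: number,
  protoMsPerByte: number,
  sizeRatio: number
): number {
  // JSON cost for n bytes:     jsonMsPerByte * n
  // Protobuf cost, same data:  libraryLoadMs + protoMsPerByte * (n / sizeRatio)
  const savedPerJsonByte = jsonMsPerByte - protoMsPerByte / sizeRatio;
  return savedPerJsonByte > 0 ? libraryLoadMs / savedPerJsonByte : Infinity;
}
```

A bigger library pushes the break-even point up, and a larger size ratio pulls it down, which is the trade-off both sides are describing.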