by alfalfasprout 2555 days ago
Not to mention protobufs have awful performance compared to more modern alternatives in use today like Flatbuffers, Thrift, Cap'n Proto, SBE.

In the case of Google's own Flatbuffers, the layout is going to be far more performant.

I think it's irrelevant. In fact, protobuf might be the best choice here. If the message were just defined like so:

  bytes key = 1;
  bytes value = 2;
... your overhead can be as little as 4 bytes and you can alias the memory of the key and value (using a type like std::string_view) instead of copying it. It takes a few nanoseconds to decode a message like this.
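
For illustration, here is a hand-rolled C++ sketch of that decode. It is not the protobuf library's generated parser, and `KeyValueView` and `DecodeKeyValue` are made-up names. It assumes both fields are present, in field-number order, and each shorter than 128 bytes, which is the case where the overhead is 4 bytes: one tag byte and one length byte per field.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string_view>

// Views that alias the input buffer; nothing is copied.
struct KeyValueView {
    std::string_view key;
    std::string_view value;
};

// Hypothetical decoder for `bytes key = 1; bytes value = 2;`.
// Assumptions (not general protobuf): fields appear in order,
// both are present, and each length fits in a single byte (< 128).
inline std::optional<KeyValueView> DecodeKeyValue(std::string_view buf) {
    KeyValueView out;
    std::string_view* fields[2] = {&out.key, &out.value};
    // Tag byte = (field_number << 3) | wire_type, with wire type 2
    // (length-delimited): field 1 -> 0x0A, field 2 -> 0x12.
    const unsigned char expected_tags[2] = {0x0A, 0x12};

    for (int i = 0; i < 2; ++i) {
        if (buf.size() < 2) return std::nullopt;
        if (static_cast<unsigned char>(buf[0]) != expected_tags[i]) return std::nullopt;
        const unsigned char len = static_cast<unsigned char>(buf[1]);
        if (len >= 0x80) return std::nullopt;  // multi-byte length: not handled here
        buf.remove_prefix(2);
        if (buf.size() < len) return std::nullopt;
        *fields[i] = buf.substr(0, len);  // aliases the caller's memory
        buf.remove_prefix(len);
    }
    if (!buf.empty()) return std::nullopt;  // this sketch rejects extra fields
    return out;
}
```

The returned views are only valid while the original buffer is alive, which is the usual price of aliasing instead of copying.
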
> protobufs have awful performance compared to more modern alternatives in use today like Flatbuffers, Thrift, Cap'n Proto, SBE

Do you have a source on that? Genuinely curious.

Hi, I wrote Protobuf v2 (the version everyone uses) and Cap'n Proto.

I don't know if I'd say Protobuf has "awful" performance. It's certainly much better than text-based formats like JSON. But the format is rather branch-y. You have to process it byte-by-byte, because e.g. integers are encoded in a variable-width encoding where each byte contains 7 bits of data plus 1 bit to indicate if this is the last byte. This results in a compact encoding, but takes a lot of cycles to encode and decode. Moreover, since everything is variable-width, in order to find any one field of the message, you must scan through all previous fields, parsing them one by one.
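
To make the byte-by-byte point concrete, here is a minimal C++ sketch of a varint decoder. `DecodeVarint` is an illustrative name, not the protobuf library's API; real implementations are more heavily optimized. Each byte carries 7 payload bits, least-significant group first, and the high bit says whether more bytes follow.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

// Decodes one base-128 varint starting at `data`.
// On success, returns the value and stores the number of bytes read in
// `*consumed`. Returns nullopt if the input ends before the final byte
// or the varint is longer than 10 bytes (the maximum for 64 bits).
inline std::optional<std::uint64_t> DecodeVarint(const std::uint8_t* data,
                                                 std::size_t size,
                                                 std::size_t* consumed) {
    std::uint64_t result = 0;
    for (std::size_t i = 0; i < size && i < 10; ++i) {
        const std::uint8_t byte = data[i];
        // Low 7 bits are payload, shifted into place by position.
        result |= static_cast<std::uint64_t>(byte & 0x7F) << (7 * i);
        // High bit clear means this was the last byte: a data-dependent
        // branch on every byte of every integer.
        if ((byte & 0x80) == 0) {
            *consumed = i + 1;
            return result;
        }
    }
    return std::nullopt;
}
```

For example, the value 300 is encoded as the two bytes `0xAC 0x02`, and the decoder cannot know the field is two bytes long until it has looked at both.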

Cap'n Proto, FlatBuffers, and SBE all use "zero-copy" encodings, meaning the data is laid out on the wire in a format that is easy for a CPU to use directly. This means, for example, that integers are fixed-width, and fields are located at fixed offsets. This is much faster to parse (or even use in-place without parsing at all), but does result in somewhat larger encodings. (But then, you can always layer on independent compression when bandwidth matters more than CPU.)
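
As a rough C++ sketch of what "fixed-width fields at fixed offsets" buys you, consider the layout below. It is invented for illustration and is not the actual wire format of Cap'n Proto, FlatBuffers, or SBE; the names `kIdOffset`, `kTimestampOffset`, and `LoadAt` are hypothetical, and values are assumed to be stored in host byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical 16-byte record layout (assumption, host byte order):
//   offset 0: uint32 id
//   offset 4: 4 bytes of padding
//   offset 8: uint64 timestamp
constexpr std::size_t kIdOffset = 0;
constexpr std::size_t kTimestampOffset = 8;
constexpr std::size_t kRecordSize = 16;

// Loads a fixed-width value from a known offset. memcpy keeps this
// well-defined even if `record` is not suitably aligned; compilers
// typically turn it into a single load instruction.
template <typename T>
inline T LoadAt(const unsigned char* record, std::size_t offset) {
    T value;
    std::memcpy(&value, record + offset, sizeof(T));
    return value;
}

// Reading one field does not require scanning any earlier field.
inline std::uint64_t ReadTimestamp(const unsigned char* record) {
    return LoadAt<std::uint64_t>(record, kTimestampOffset);
}
```

The padding bytes are the "somewhat larger encoding" part of the trade: the record is always 16 bytes, even when the values would fit in two or three varint bytes.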

My understanding is that Thrift is closer to Protobuf and contemporaneous with it, so I don't know why GP included it in the list.

For simple protocols protobuf decoding has no taken branches. I.e. if you only use the first 15 field numbers (all your tags are 1 byte) and if all the types are the expected types, and if all the variable-length items are < 128 bytes long then you can decode the message without taking any branches. In C++. Most of the other languages have simpler and slower codecs.

This is the hot path in C++[1]. A really large amount of work has gone into protobuf C++ performance in the last 3 years or so.

1: https://github.com/protocolbuffers/protobuf/blob/master/src/...

And all your integer fields must be < 128, right?

Yes, I suppose the branches in Protobuf can be pretty predictable. Still, you do generally have to examine each byte individually.

Sure. In this specific case of a kv store it's hard to imagine how to simplify it dramatically from protobuf. As a proto you might have: tag-length-key-tag-length-value. Instead you could store the key and value lengths in host format using 8-16 bytes: length-length-key-value. It's not _dramatically_ faster to decode this, and you traded away extensibility to get a marginal speedup.
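
A C++ sketch of that length-length-key-value alternative, assuming two 32-bit lengths in host byte order (the 8-byte variant). `RecordView`, `EncodeRecord`, and `DecodeRecord` are made-up names for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <string>
#include <string_view>

struct RecordView {
    std::string_view key;
    std::string_view value;
};

constexpr std::size_t kHeaderSize = 2 * sizeof(std::uint32_t);  // 8 bytes

// Layout: [key_len: u32][value_len: u32][key bytes][value bytes]
inline std::string EncodeRecord(std::string_view key, std::string_view value) {
    const std::uint32_t key_len = static_cast<std::uint32_t>(key.size());
    const std::uint32_t value_len = static_cast<std::uint32_t>(value.size());
    std::string out(kHeaderSize, '\0');
    std::memcpy(&out[0], &key_len, sizeof key_len);
    std::memcpy(&out[sizeof key_len], &value_len, sizeof value_len);
    out.append(key.data(), key.size());
    out.append(value.data(), value.size());
    return out;
}

// Returns views into `buf`, or nullopt if the buffer is too short.
inline std::optional<RecordView> DecodeRecord(std::string_view buf) {
    if (buf.size() < kHeaderSize) return std::nullopt;
    std::uint32_t key_len = 0;
    std::uint32_t value_len = 0;
    std::memcpy(&key_len, buf.data(), sizeof key_len);
    std::memcpy(&value_len, buf.data() + sizeof key_len, sizeof value_len);
    buf.remove_prefix(kHeaderSize);
    // Bounds checks written so they cannot overflow.
    if (key_len > buf.size() || value_len > buf.size() - key_len) return std::nullopt;
    return RecordView{buf.substr(0, key_len), buf.substr(key_len, value_len)};
}
```

The decode is two fixed-size loads plus bounds checks, but there is no tag byte, so a third field cannot be added later without changing the format. That is the extensibility being traded away.
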
Sure, I was speaking in general, not specifically about the key-value case.

I think most serialization frameworks are likely to be overkill for such a use case, spending more time on setup than actual parsing.

Also note that storing the value (and maybe the key) with proper alignment might make it easier to use the data in-place, saving a copy.

Hit 'y' before copying the link; the line numbers have already shifted.
Which version of Thrift? Apache Thrift looks roughly the same as Protobuf on our benchmarks. Perhaps this is fbthrift?