by zheng 4664 days ago
The claims on this site are pretty impressive, but I have close to zero knowledge of the history here, so can someone comment on how many grains of salt this should be taken with? Otherwise, this looks pretty cool. Something that beats protobufs in overall speed could be really helpful depending on the application.
4 comments

Just to give a bit of counterpoint, here are some trade-offs that Cap'n Proto makes compared with protobufs. (Full disclosure: I work at Google and know Kenton from his time here; I have my own protobuf library that I've worked on for several years.) I'm sure Kenton will correct me if I get anything wrong. :)

Cap'n Proto's key design characteristic is to use the same encoding on-the-wire as in-memory. Protobufs have a wire format that looks something like:

  [field number 3][value for field 3]
  [field number 7][value for field 7]
  etc.

The fieldnum/value pairs can come in any order, and a message may include as many or as few of the declared fields as it needs. This serialization format doesn't work for in-memory usage because for general programming you need O(1) access to each value, so protobufs have a "parse" step that unpacks this into a C++ class where each field has its own member.
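
To make the tag/value layout concrete, here is a minimal Python sketch. It handles only integer (varint) fields, and the function names are made up for illustration; it is not the protobuf library API. The key byte is the field number shifted left by three bits, with the low bits holding the wire type (0 for varint).

```python
def encode_varint(n):
    """Encode a non-negative int as a base-128 varint (low 7 bits first)."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf, pos):
    """Return (value, next_pos) for the varint starting at buf[pos]."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_fields(fields):
    """Serialize {field_number: int}; absent fields cost zero bytes."""
    out = bytearray()
    for number, value in fields.items():
        out += encode_varint(number << 3)  # low 3 bits = wire type 0 (varint)
        out += encode_varint(value)
    return bytes(out)


def parse_fields(buf):
    """The "parse" step: one linear scan that unpacks every pair."""
    fields = {}
    pos = 0
    while pos < len(buf):
        key, pos = decode_varint(buf, pos)
        value, pos = decode_varint(buf, pos)
        fields[key >> 3] = value
    return fields


wire = encode_fields({7: 150, 3: 1})  # any field order is fine
print(wire.hex())
print(parse_fields(wire))
```

Note that there is no way to find field 3 in `wire` without walking past field 7 first, which is why the scan into a keyed structure is needed before general use.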

Protobufs are heavily optimized so this parsing is fast, but it's still a very noticeable cost in high-volume systems. So Cap'n Proto defines its wire format such that it also has O(1) access to arbitrary fields. This makes it suitable as an in-memory format as well.

While this avoids a parsing step, it also means that your wire format has to preserve the empty spaces for fields that aren't present. So to get the "infinitely faster" advantage, you have to accept this cost. For dense messages, this can actually be smaller than the comparable protobuf because you don't have to encode the field numbers. But for very sparse messages, this can be arbitrarily larger.
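
A rough way to see the size trade-off, as a Python sketch. The 16-field schema is hypothetical, and the tag/value side is a toy (one byte of field number plus a fixed 8-byte value), not real protobuf encoding, which uses varints and would be smaller still.

```python
import struct

# Hypothetical schema: 16 unsigned 64-bit fields, each at a fixed offset.
NUM_FIELDS = 16
LAYOUT = struct.Struct("<%dQ" % NUM_FIELDS)


def encode_fixed(fields):
    """Fixed layout: every field gets its 8-byte slot, set or not."""
    slots = [fields.get(i, 0) for i in range(NUM_FIELDS)]
    return LAYOUT.pack(*slots)


def read_field(buf, index):
    """O(1) access: jump straight to the slot, no scan and no parse."""
    return struct.unpack_from("<Q", buf, index * 8)[0]


def encode_tagged(fields):
    """Toy tag/value encoding: 1-byte field number + 8-byte value,
    emitted only for fields that are present."""
    return b"".join(struct.pack("<BQ", i, v) for i, v in fields.items())


dense = {i: i + 1 for i in range(NUM_FIELDS)}
sparse = {5: 42}

for name, msg in (("dense", dense), ("sparse", sparse)):
    print(name, len(encode_fixed(msg)), len(encode_tagged(msg)))
```

With every field set, the fixed layout comes out smaller because it carries no field numbers; with one field set, it still pays for all sixteen slots.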

As Kenton points out on http://kentonv.github.io/capnproto/encoding.html , lots of zeros compress really well, so even sparse messages can become really small by compressing them. To do this you lose "infinitely faster", but according to Kenton this is still faster than protobufs.
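
A quick illustration of the "zeros compress well" point, using Python's zlib purely as a stand-in compressor. The 128-slot buffer is a made-up example, and zlib is an assumption here, not the scheme described on the linked page.

```python
import struct
import zlib

# Hypothetical sparse message: 128 fixed 8-byte slots, only one of them set.
sparse = bytearray(1024)
struct.pack_into("<Q", sparse, 5 * 8, 42)

compressed = zlib.compress(bytes(sparse))
restored = zlib.decompress(compressed)

print(len(sparse), len(compressed))
```

The receiver has to run the decompress step before it can touch any field, which is the cost being traded for the smaller message.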

In both cases though, the tight coupling between the (uncompressed) wire format and the in-memory format imposes certain things on your application with regard to memory management and the mutation patterns the struct will allow. For example, it appears that the in-memory format was not sufficiently flexible for Python to wrap it directly, so the Python extension does in fact have a parse step.

Other cases where you could need a parse/serialize step anyway: if you want to put the wire data into a specialized container like a map or set (or your own custom data classes), or if the supported built-in mutation patterns are not flexible enough for you (for example, the Cap'n Proto "List" type appears to have limitations on how and when a list can grow in size).

It's very cool work, but I don't believe it obsoletes Protocol Buffers. I'm actually interested in making the two interoperate, along with JSON -- these key/value technologies are so similar in concept and usage that I think it's unfortunate they don't interoperate better.

Generally a fair analysis. A few comments/corrections:

> For example, it appears that the in-memory format was not sufficiently flexible for Python to wrap it directly, so the Python extension does in fact have a parse step.

This is not correct. The Python wrapper directly wraps the C++ interface. You might be confused by Jason's claim that "The INFINITY TIMES faster part isn't so true for python", but this was apparently meant as a joke.

It is true, though, that the constraints of arena-style allocation (which Cap'n Proto necessarily must use to be truly zero-copy) mean that working with Cap'n Proto types is not quite as convenient as protobufs, although most users won't notice much of a difference. Lists not being dynamically resizable is the biggest sore point, though most use cases are better off not relying on dynamic resizing (it's slow), and the use cases that really do need it can get around the problem using orphans (build an std::vector<Orphan<T>>, then compile that into a List<T> when you're done).
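
To illustrate why arena allocation and growable lists don't mix well, here is a toy Python sketch of a bump allocator. The `Arena` class and `write_list` helper are hypothetical stand-ins, not the Cap'n Proto API, and the collect-then-write pattern is only loosely analogous to the orphan approach described above.

```python
import struct


class Arena:
    """Toy bump allocator over one contiguous buffer."""

    def __init__(self, capacity):
        self.buf = bytearray(capacity)
        self.used = 0

    def alloc(self, nbytes):
        """Hand out the next nbytes; nothing is ever moved or freed."""
        if self.used + nbytes > len(self.buf):
            raise MemoryError("arena full")
        offset = self.used
        self.used += nbytes
        return offset


def write_list(arena, values):
    """Allocate a list of uint64s; its length is final once written,
    because whatever is allocated next sits right behind it."""
    offset = arena.alloc(8 * len(values))
    struct.pack_into("<%dQ" % len(values), arena.buf, offset, *values)
    return offset, len(values)


arena = Arena(1024)

# Collect outside the arena first, where growing is cheap...
pending = []
for n in range(10):
    if n % 3 == 0:
        pending.append(n)

# ...then size the arena list exactly once.
offset, count = write_list(arena, pending)
print(struct.unpack_from("<%dQ" % count, arena.buf, offset))
```

Growing the list in place would mean allocating a bigger copy at the end of the arena and leaving the old one behind as dead space, which is the cost this pattern avoids.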

OTOH, over the years, many people have requested the ability to use arena allocation with Protobufs due to the speed benefits, especially with Protobufs being rather heap-hungry. I always had to tell them "It would require such a massive redesign that it's not feasible."

And yes, there is the trade-off of padding on the wire. You have to decide whether your use case is more limited by bandwidth or CPU. With Cap'n Proto you get to choose between packing (removing the zeros, at the cost of a non-free encode/decode step) and not packing (infinitely-fast encode/decode, larger messages). For intra-datacenter traffic you'd probably send raw, whereas for cross-internet you'd pack. Protobufs essentially always pack without giving you a choice. And because Protobuf generates unique packing code for every type you define (rather than using a single, tight implementation that operates on arbitrary input bytes), Protobuf "packing" tends to be slower.
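
A simplified sketch of what a schema-independent packer can look like, in Python. Each 8-byte word becomes a tag byte (bit i set when byte i is nonzero) followed by just the nonzero bytes. This is an illustration under that assumption only; the real packing scheme also has run-length special cases, which are omitted here.

```python
def pack(data):
    """Zero-packing over arbitrary bytes; knows nothing about the schema."""
    if len(data) % 8:
        raise ValueError("input must be a whole number of 8-byte words")
    out = bytearray()
    for start in range(0, len(data), 8):
        word = data[start:start + 8]
        tag = 0
        for i, b in enumerate(word):
            if b:
                tag |= 1 << i  # mark byte i of this word as present
        out.append(tag)
        out += bytes(b for b in word if b)
    return bytes(out)


def unpack(packed):
    """Inverse of pack: re-insert a zero wherever the tag bit is clear."""
    out = bytearray()
    pos = 0
    while pos < len(packed):
        tag = packed[pos]
        pos += 1
        for i in range(8):
            if tag & (1 << i):
                out.append(packed[pos])
                pos += 1
            else:
                out.append(0)
    return bytes(out)


raw = bytearray(40)  # five words, mostly padding
raw[8] = 42
raw = bytes(raw)

packed = pack(raw)
print(len(raw), len(packed))
```

The same two loops work for every message type, which is the contrast being drawn with per-type generated code.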

Thanks for the correction on the Python point.

> OTOH, over the years, many people have requested the ability to use arena allocation with Protobufs due to the speed benefits, especially with Protobufs being rather heap-hungry. I always had to tell them "It would require such a massive redesign that it's not feasible."

Yes totally, I agree that arena allocation is great. I think we both agree on this point, though we've taken two different paths in attempting to solve it.

Your approach is to say that arena allocation can be made pretty convenient, and sparse messages can compress really well, so let's design a message format that is amenable to arena allocation and then implement a system that uses this format both on-the-wire and in memory.

My approach is to say that we can solve this (and many other related problems) by decoupling wire formats from in-memory formats, and having the two interoperate through parsers that implement a common visitor-like interface. Then a single parser (which has been optimized to hell) can populate any kind of in-memory format, or stream its output to some other wire format. Of course this will never beat a no-parser design in speed, but the world will never have all its data in one single format.
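
A minimal Python sketch of the visitor idea, with everything in it hypothetical: the `key=value;key=value` input format, the handler method names, and both handlers are invented for illustration and do not reflect any real library's interface.

```python
import json


class DictBuilder:
    """Handler that populates an in-memory structure."""

    def __init__(self):
        self.result = {}

    def on_field(self, name, value):
        self.result[name] = value

    def on_end(self):
        pass


class JsonWriter:
    """Handler that streams straight into another wire format."""

    def __init__(self):
        self.parts = []
        self.text = None

    def on_field(self, name, value):
        self.parts.append("%s:%s" % (json.dumps(name), json.dumps(value)))

    def on_end(self):
        self.text = "{" + ",".join(self.parts) + "}"


def parse(wire, handler):
    """Toy parser for 'key=value;key=value' with integer values.
    It only emits events and knows nothing about the destination."""
    for pair in wire.split(";"):
        if pair:
            name, _, value = pair.partition("=")
            handler.on_field(name, int(value))
    handler.on_end()


builder = DictBuilder()
parse("id=7;age=30", builder)
print(builder.result)

writer = JsonWriter()
parse("id=7;age=30", writer)
print(writer.text)
```

One parser serves both destinations; adding a third output only means writing another handler.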

I think of these two approaches as totally complementary; to me Cap'n Proto is simply another key/value serialization format with a particular set of nice properties, and I want it to be easy to convert between that and other formats.

Since your approach is much more focused, you have been able to turn out usable results orders of magnitude faster than I have. I'm spending time implementing all of the various protobuf features and edge cases that have accumulated over the years, while simultaneously refining my visitor interface to be able to accommodate them while remaining performance-competitive with the existing protobuf implementation (and not getting too complex). As much as I believe in what I'm doing, I do envy how you have freed yourself from backward compatibility concerns and turned out useful work so quickly.

It's more like I started from "Let's design a message format that can be passed through shared memory or mmap()ed with literally zero copies", and then arena allocation was a natural requirement. :)

> Since your approach is much more focused, you have been able to turn out usable results orders of magnitude faster than I have.

To be fair, the fact that I'm working on it full-time -- and with no review, approval, or other management constraints of any kind -- helps a lot. :) (Downside is, no income...)

So, as the author of Cap'n Proto I'm biased -- though I'm also the author of Protobufs v2, so I'm not completely biased. :)

"Infinitely faster" is of course meant more to illustrate how Cap'n Proto works than to be taken as a literal speed measure. Although, if you actually wanted to compare Cap'n Proto to Protobufs, it's unclear what other number you can really come up with. The normal way to compare Protobuf speed vs. anything else is to measure the encode or decode step, but Cap'n Proto has no such step. You can measure an end-to-end system using one vs. the other, but then on the Cap'n Proto side you are basically measuring the speed of everything _except_ the Cap'n Proto code.

The git repo includes some contrived benchmarks along those lines which you can try out. I don't post the numbers because I'm not sure they are meaningful (even though they appear very favorable for Cap'n Proto). I'm really hoping to see a few unbiased third parties benchmark Cap'n Proto vs. Protobufs in real-world systems at some point.

Of course, the larger point here is that Cap'n Proto allows you to do things that Protobuf simply doesn't support, like mmap()ing in a large file and reading one field out of it in constant time, whereas with Protobuf you have to parse the whole thing, which takes O(size of file) time.
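
The mmap() case can be sketched in Python with a made-up file layout: a flat array of uint64 values at fixed offsets. This is an assumption for illustration, not the Cap'n Proto format or API.

```python
import mmap
import os
import struct
import tempfile

RECORD = struct.Struct("<Q")  # hypothetical layout: flat array of uint64s


def write_file(path, count):
    """Write count fixed-size records; record i holds i squared."""
    with open(path, "wb") as f:
        for i in range(count):
            f.write(RECORD.pack(i * i))


def read_field(path, index):
    """Map the file and read one slot by offset. Nothing is parsed, and
    the OS loads pages on demand rather than reading the whole file."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return RECORD.unpack_from(m, index * RECORD.size)[0]


path = os.path.join(tempfile.mkdtemp(), "fields.bin")
write_file(path, 100_000)
print(read_field(path, 99_999))
```

The cost of `read_field` does not depend on the file size; with a tag/value format, reaching the last field would mean scanning everything before it.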

Thanks a lot for the reply. One of my favorite things about HN is getting questions answered by the authors of the tool in question. After reading more, your decision to not post those benchmarks is a smart one. I get where you're coming from with regard to it being hard to make a performance comparison to protobufs; it makes sense now. If/when I need to reach for some serialization I'll certainly try out Cap'n Proto.

> If/when I need to reach for some serialization I'll certainly try out Cap'n Proto.

If/when you do, remember that the mailing list is friendly and we very much want to hear your feedback and help you with any problems. :)

There's a big focus on how it's better than protobufs because it's written by the original creator of protobufs and is his attempt at making a better version based on what he's learned. By the time it hits 1.0 I suspect the only reason to use protobufs rather than it will be for backwards compatibility with existing systems.

> I have close to zero knowledge of the history here, so can someone comment on how many grains of salt this should be taken with?

Not many, as far as I know. Kenton worked on Protobufs at Google for years, so he should know exactly what he's doing here.