by srollyson 4667 days ago
Hi, Kenton.

First: thanks for your work on Protocol Buffers. I've used it fairly extensively for RPC communications between C++/Java clients and a Java service. It made things so much easier to get native objects in each language using a well-defined protocol.

One thing that bugged me about Protobuf is that it provided a skeletal mechanism for RPC (e.g. RpcController/RpcChannel) but later deprecated the use of that mechanism in favor of code-generating plugins. Since Cap'n Proto is billed as an "RPC system", do you have plans to include a more fleshed-out version of RPC than was provided in Protocol Buffers? Having abstract classes for event-handling and transport mechanisms is a good idea for extensibility, but it sure would make it easier for your users if there were at least one default implementation of each.

I imagine that Google has standard implementations of these things internally but balked at trying to support them for multiple languages as an open source project.


Yes, in fact, the next release of Cap'n Proto (v0.4) is slated to include RPC support. There are some hints on what it might look like in the docs already:

http://kentonv.github.io/capnproto/rpc.html
http://kentonv.github.io/capnproto/language.html#interfaces

The reason Google never released an RPC system together with protobufs is that Google's RPC implementation simply had too many dependencies on other Google infrastructure, and wasn't appropriate for use outside of Google datacenters. There were a few attempts to untangle the mess and produce something that could be released, but it never happened.

The public release had support for generating generic stubs, as you mentioned, but it was later decided that these stubs were actually a poor basis for implementing an RPC system. In their attempt to be generic, their interface ended up being rather awkward. We later decided that it made more sense to support code generator plugins, so that someone implementing an RPC system could provide a plugin that generates code ideal for that particular system. The generic interfaces were then deprecated.

Cap'n Proto also supports code generation plugins. But, as I said, we will soon have an "official" RPC layer as well -- and it will hopefully be somewhat pluggable itself, so that you can use a different underlying transport with the same generated interface code. Anyway, this will all become clearer with the next release, so stay tuned!

I'm not going to lie; it took me a little while to wrap my head around those stubs before implementing a TCP transport and semaphore triggers to unblock outstanding RPC function calls. However, it seemed much easier to do that than write a plugin for protoc to generate code that did roughly the same thing.
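
For anyone wondering what "semaphore triggers" for outstanding calls might look like, here is a minimal sketch in Python rather than the C++/Java mentioned above. It assumes each call gets an ID, the caller blocks on a per-call semaphore, and a reader thread releases it when the matching response arrives. The names `PendingCalls`, `blocking_call` and `fake_send` are hypothetical, and the timer stands in for a real TCP transport; none of this is protobuf's API.

```python
import threading


class PendingCalls:
    """Table of outstanding calls; each caller blocks on its own semaphore."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next_id = 0
        self._waiting = {}  # call id -> (semaphore, one-element result slot)

    def register(self):
        """Allocate a call id before the request is written to the transport."""
        with self._lock:
            call_id = self._next_id
            self._next_id += 1
            self._waiting[call_id] = (threading.Semaphore(0), [])
        return call_id

    def wait(self, call_id, timeout=None):
        """Block the calling thread until complete() is called for this id."""
        with self._lock:
            sem, slot = self._waiting[call_id]
        acquired = sem.acquire(timeout=timeout)
        with self._lock:
            del self._waiting[call_id]
        if not acquired:
            raise TimeoutError(f"call {call_id} timed out")
        return slot[0]

    def complete(self, call_id, response):
        """Called by the reader thread when a response for call_id arrives."""
        with self._lock:
            entry = self._waiting.get(call_id)
        if entry is None:
            return False  # the caller already gave up on this call
        sem, slot = entry
        slot.append(response)
        sem.release()  # unblocks the thread sitting in wait()
        return True


def blocking_call(pending, send, payload, timeout=5.0):
    """Make an asynchronous transport look like a blocking function call."""
    call_id = pending.register()
    send(call_id, payload)
    return pending.wait(call_id, timeout)


def demo():
    pending = PendingCalls()

    def fake_send(call_id, payload):
        # Assumption: stands in for "write to socket, reader thread gets reply".
        reply = threading.Timer(0.01, pending.complete, args=(call_id, payload.upper()))
        reply.start()

    return blocking_call(pending, fake_send, "ping")
```

The point of the design is that the transport can stay fully asynchronous while callers still see an ordinary blocking function.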

I'm currently considering RPC implementations for a personal project I'm working on. Right now I may end up trying Thrift since it seems to support RPC out of the box, but my ultimate goal is to have a WebSockets transport which Thrift doesn't provide. I may end up contributing to Cap'n Proto if it looks like the effort required to get RPC up and running has at least some parity with the effort required to extend Thrift for my needs.

It's clear from your planned use of futures and shared memory that your goal for Cap'n Proto is to make it the go-to library for communication in parallel computing. I'm definitely eager to see Cap'n Proto succeed in that endeavor. JSON is great for readability but it really isn't going to cut it when efficiency matters!

I hope you haven't forgotten to check whether ICE might work for you before you go looking at these "new" things. (I'm not assuming you haven't, but ICE seems to have fallen out of hype the day protobuf launched, and I'm not convinced that fall was entirely justified.)

https://bitbucket.org/arco_group/example.ice-websocket

I look forward to hearing from you, should you decide to contribute. :) A WebSocket transport for Cap'n Proto would make a lot of sense, particularly if paired with a JavaScript implementation, which one or two people have claimed they might create. I expect it will be easy to hook this in as a transport without disturbing much of the RPC implementation.

One random idea that just hit me, if you're thinking about RPC layers anyway: make sure that Cap'n Proto plays well with 0MQ. They probably do already, but a published example or two demonstrating it would not be a bad thing.

You can certainly send Cap'n Proto messages over 0MQ (or nanomsg) pretty easily -- Cap'n Proto gives you bytes, 0MQ takes bytes. Done deal.

However, supporting Cap'n Proto's planned RPC system on top of 0MQ may not work so well. The thing is, 0MQ implements specific interaction patterns, such as request/response, publish/subscribe, etc. Meanwhile, Cap'n Proto RPC is based on a different, more fundamental object-oriented model that doesn't fit into any of these patterns. A Cap'n Proto connection does not have a defined requester or responder -- both sides may hold any number of references to objects living on the other side, to which they can make requests at any time. So it fundamentally doesn't fit into the req/rep model, much less things like pub/sub. On the other hand, you can potentially build a pub/sub system on top of Cap'n Proto's model (as well as, trivially, a req/rep system).
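
To make the "no defined requester or responder" point concrete, here is a toy in-process model, not Cap'n Proto's actual API or wire protocol. It assumes each side of a connection keeps a table of exported objects and may call into the other side's table at any time. `Peer`, `Publisher`, `Listener` and the integer export IDs are all hypothetical names. The sketch also shows a small pub/sub built on top of that model: the client calls the server to subscribe, and the server later calls back into an object living on the client.

```python
class Peer:
    """One end of a symmetric connection; both ends can export and call."""

    def __init__(self, name):
        self.name = name
        self.exports = {}   # export id -> local object reachable by the other side
        self.remote = None  # the peer at the other end of the connection

    def connect(self, other):
        self.remote, other.remote = other, self

    def export(self, obj):
        """Make a local object callable from the other side; returns its id."""
        export_id = len(self.exports)
        self.exports[export_id] = obj
        return export_id

    def call(self, export_id, method, *args):
        # Assumption: a real system would serialize this and send it over the wire.
        return self.remote._dispatch(export_id, method, args)

    def _dispatch(self, export_id, method, args):
        return getattr(self.exports[export_id], method)(*args)


class Listener:
    """Lives on the subscriber's side and receives callbacks."""

    def __init__(self):
        self.received = []

    def on_event(self, event):
        self.received.append(event)


class Publisher:
    """Lives on the publisher's side; holds references to remote listeners."""

    def __init__(self, peer):
        self.peer = peer
        self.subscribers = []  # export ids of Listener objects on the other side

    def subscribe(self, listener_id):
        self.subscribers.append(listener_id)
        return True

    def publish(self, event):
        for listener_id in self.subscribers:
            self.peer.call(listener_id, "on_event", event)


def demo():
    server, client = Peer("server"), Peer("client")
    server.connect(client)

    publisher = Publisher(server)
    publisher_id = server.export(publisher)

    listener = Listener()
    listener_id = client.export(listener)

    client.call(publisher_id, "subscribe", listener_id)  # client -> server
    publisher.publish("hello")                           # server -> client
    publisher.publish("world")
    return listener.received
```

Neither `Peer` is "the" requester here: the same connection carries calls in both directions, which is the part that doesn't map onto a fixed req/rep socket pair.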

I discussed this a bit on the mailing list:

https://groups.google.com/d/msg/capnproto/JYwBWX9eNqw/im5r_E...

At least, this is my understanding based on what I've managed to read so far of 0MQ's docs. I intend to investigate further, because it would be great to reuse existing work where it makes sense, but at the moment it isn't looking like a good fit. If I've missed something, definitely do let me know.

The killer feature that I like for 0MQ is that you can support message passing asynchronously, even when the other side is not currently up. For instance in a request/response pattern, one side might go away, get restarted, reinitialize, and then they carry on as if there wasn't a period in the middle where there was no connection. This kind of robust handling of network interruptions is very convenient for many use cases.
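
The behavior described above can be modeled very roughly as an outbox that keeps accepting messages while the peer is down and flushes them on reconnect. This is only a toy illustration of the idea, not how 0MQ is implemented; `BufferedSender`, `peer_up` and `peer_down` are hypothetical names, and a plain callable stands in for the network link.

```python
from collections import deque


class BufferedSender:
    """Accepts messages at any time; delivers them whenever a peer is attached."""

    def __init__(self):
        self._outbox = deque()
        self._deliver = None  # callable for the connected peer, None while down

    def send(self, message):
        # Sending never fails just because the other side is away.
        self._outbox.append(message)
        self._flush()

    def peer_up(self, deliver):
        """The peer (re)connected; drain anything queued in the meantime."""
        self._deliver = deliver
        self._flush()

    def peer_down(self):
        self._deliver = None

    def _flush(self):
        while self._deliver is not None and self._outbox:
            self._deliver(self._outbox.popleft())


def demo():
    received = []
    sender = BufferedSender()

    sender.peer_up(received.append)
    sender.send("a")

    sender.peer_down()      # the other side restarts...
    sender.send("b")
    sender.send("c")
    before_reconnect = list(received)

    sender.peer_up(received.append)  # ...and comes back
    return before_reconnect, received
```

From the sender's point of view the outage is invisible: messages sent during the gap simply arrive later, in order.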

However, what you describe isn't necessarily going to fit into that. The #1 thing that your description makes me wonder about is whether RPCs are going to be synchronous or asynchronous. So, for instance, if you hand me a data structure with a list of objects that are references to data that I want to have, and I decide that I need 10 of them, do I have to pay for the overhead of 10 round trips, or can I say, "I need these 10" and get them all at once?
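
The cost being asked about can be shown with a small counting model. It says nothing about how Cap'n Proto will actually answer the question; `CountingChannel`, `fetch_one_by_one` and `fetch_batched` are hypothetical names, and each `request` call is assumed to cost exactly one network round trip.

```python
class CountingChannel:
    """Stand-in for a connection; every request() counts as one round trip."""

    def __init__(self, store):
        self.store = store
        self.round_trips = 0

    def request(self, keys):
        self.round_trips += 1
        return [self.store[key] for key in keys]


def fetch_one_by_one(channel, keys):
    # Synchronous style: wait for each object before asking for the next.
    return [channel.request([key])[0] for key in keys]


def fetch_batched(channel, keys):
    # "I need these 10": one request carrying all the references.
    return channel.request(list(keys))


def demo():
    store = {i: f"object-{i}" for i in range(100)}
    wanted = list(range(10))

    slow = CountingChannel(store)
    fast = CountingChannel(store)

    same_result = fetch_one_by_one(slow, wanted) == fetch_batched(fast, wanted)
    return same_result, slow.round_trips, fast.round_trips
```

Both approaches return the same ten objects; the difference is ten round trips of latency versus one.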