by SMFloris 1922 days ago
Recently I started comparing gRPC with RPC over RabbitMq using json as the message format. I saw that for small to medium sized messages, gRPC is actually slower. Of course, this was just a small scale experiment, so I don't read too much into it.

Does anyone have a sort of infrastructure/architecture guide at a bigger scale for gRPC?

My biggest questions are: How do you actually load balance the servers? What happens if you have a sudden influx of requests but don't want to auto-scale? Do you still need some sort of queueing system in front of the gRPC server?

In my research I wasn't able to find any noteworthy articles about this, which triggered my curiosity.

4 comments

I don't know if there is a unified playbook, but there are random blog posts. One that comes to mind (disclosure: I worked on the system, mostly after the blog entry) is https://dropbox.tech/infrastructure/courier-dropbox-migratio... (and perhaps https://dropbox.tech/infrastructure/how-we-migrated-dropbox-...)

I suppose the problem with an easy go-to playbook is that if it were easy to pick one solution for every problem, gRPC would likely have picked it by default (I can attest the team, and the external contributors, are top notch, having worked alongside them). Unfortunately some problems, at scale, need to be answered depending on architecture with the holistic system in mind, and are not just gRPC issues. Truth is large scale systems are hard to build and operate. Perhaps that is why you get help from experienced individuals and consultants.

I do agree some documentation/tooling is lacking and could be improved to guide folks through the process.

Thank you very much for these articles. Really top-notch work and provided me with some great insights!
I found gRPC not particularly fast on a per-message basis unless you're using the streaming feature. If you have iterated calls, consider streaming instead.

Because it supports full duplex streaming, there's a risk of tunneling your own less than fully specified protocol on top of gRPC. In some circumstances that may be worth taking advantage of, because gRPC takes care of session management, reconnecting, authorization (i.e. it has ways for you to add authorization, like headers) etc.

If you need queuing I think you should use a queue instead.

Very interesting. Did you also try Protobuf or MsgPack / SMILE over RabbitMq?
Never tried with MsgPack, but I did try Protobuf. The reason I dislike Protobuf is that there is an extra code generation step I need to do. When using a compiled language, I guess you don't really mind the extra code generation step, since you also get some safety from the compiler so you don't wind up misusing the generated code. It is not the same when you are using an interpreted language that doesn't have really good strong typing. You will have to run a static code analyser and be very careful every time you make a change to the interface.

In my experiments I was calling a method in Php through RabbitMq from a Rust worker, so it just proved a lot simpler to use json. Also, I measured the time it took to:

1. Make a request to the Php API

2. Php sends the rpc message on the queue

3. Rust processes that message

4. Php catches the response from the worker

5. Php returns the response to the http client

It was <10ms running the cluster locally, regardless of whether I used Protobuf or simply json.
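The request/reply flow in the steps above can be sketched in a few lines. This is only an in-process illustration, not the original setup: two `queue.Queue` objects stand in for the RabbitMq request and reply queues, a thread stands in for the Rust worker, and the `add` method and message fields (`id`, `method`, `params`, `result`) are hypothetical names chosen for the example.

```python
import json
import queue
import threading
import uuid

request_queue = queue.Queue()  # stands in for the broker's request queue
reply_queue = queue.Queue()    # stands in for the caller's reply queue


def worker():
    """Plays the worker role: consume requests and publish replies."""
    while True:
        raw = request_queue.get()
        if raw is None:  # sentinel used to stop the worker
            break
        msg = json.loads(raw)
        # Hypothetical "add" method; the method name is not dispatched on here.
        result = sum(msg["params"])
        reply_queue.put(json.dumps({"id": msg["id"], "result": result}))


def rpc_call(method, params, timeout=1.0):
    """Plays the API role: publish a request, then wait for the matching reply."""
    correlation_id = str(uuid.uuid4())
    request = {"id": correlation_id, "method": method, "params": params}
    request_queue.put(json.dumps(request))
    reply = json.loads(reply_queue.get(timeout=timeout))
    if reply["id"] != correlation_id:
        raise RuntimeError("reply does not match request")
    return reply["result"]


thread = threading.Thread(target=worker, daemon=True)
thread.start()
answer = rpc_call("add", [1, 2, 3])
request_queue.put(None)  # stop the worker
thread.join()
print(answer)
```

The correlation id is what lets the caller match a reply to its request when several calls share one reply queue.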

The protobuf vs msgpack benchmarks are not too bad. Msgpack performs very decently.

https://github.com/alecthomas/go_serialization_benchmarks

One observation / warning I have WRT msgpack is that I have seen situations where data serialized directly to msgpack can't be rendered as proper json. Specifically, JSON requires control characters inside strings to be escaped, while msgpack is perfectly happy to carry raw binary bytes.

The specific situation I saw was 1) antique perl application barfs a sql dump into msgpack 2) fluentd takes that, turns it into a thing like json but with a control character in it (think old time cyan blinky happy face or \0x03) 3) that thing gets dumped into kafka 4) logstash pulls that thing out of kafka and tries to feed it to elasticsearch 5) elasticsearch reports that it doesn't like the blinky happy face, generates an extensive log, and stores the event minus the key/value pair with the blinky happy face.

Certainly many of these steps have an implicit "don't do that!" or "Update to a newer thinger!" but the root cause (perl serializer into msgpack renders a thing that's valid msgpack but not valid json) is surprising in an unpleasant way.
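The JSON side of that failure is easy to reproduce with Python's standard `json` module. This is a minimal sketch of the rule only; it does not reproduce the fluentd or logstash behaviour described above, and the `msg` key and sample value are made up for the example.

```python
import json

raw_value = "happy\x03face"  # string containing the control character U+0003

# A compliant encoder escapes the control character, so the output is valid JSON.
encoded = json.dumps({"msg": raw_value})

# Naive string concatenation leaves the raw control character inside the string.
naive = '{"msg": "' + raw_value + '"}'


def is_valid_json(text):
    """Return True if a strict JSON parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(is_valid_json(encoded))
print(is_valid_json(naive))
```

A strict parser accepts the escaped form and rejects the one with the raw control character, which is the kind of rejection a downstream consumer would hit.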

You didn't define small or medium size, or the number of messages. In my experience, server streaming, while great for memory and for giving the client time to process messages as they come, can be slower than unary calls. One way to solve that is to send a batched message with a repeated field of the underlying message you want to send; it's an order of magnitude faster.
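The batching idea can be sketched without any gRPC dependency. Here a Python list stands in for the `repeated` field of a hypothetical batch message, and `batched` is an illustrative helper, not part of any gRPC API.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Group items into lists of at most batch_size, preserving order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly shorter, batch
        yield batch


# Ten messages sent as three batched messages instead of ten separate ones.
batches = list(batched(range(10), 4))
print(len(batches))
```

Each batch would then be sent as one message, so the per-message overhead is paid once per batch rather than once per item. How much that helps depends on message size and the transport.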

I read some time ago about server pushback. Maybe you need to override or implement your own StreamObserver to tell the client to slow down because the server isn't ready yet, or configure the amount of data the gRPC server can queue. There are quite a few configuration options (a lot of them obscure) for the server and the gRPC service.
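One way to picture "configure the amount of data the server can queue" is a bounded inbox that refuses work when it is full. This is a generic sketch under that assumption, not gRPC's own mechanism; `BoundedInbox`, `offer`, and `take` are hypothetical names.

```python
import queue


class BoundedInbox:
    """Accepts requests up to a fixed capacity and rejects the rest."""

    def __init__(self, capacity: int):
        self._queue = queue.Queue(maxsize=capacity)

    def offer(self, request) -> bool:
        """Try to enqueue without blocking; return False if the inbox is full."""
        try:
            self._queue.put_nowait(request)
            return True
        except queue.Full:
            return False

    def take(self):
        """Remove and return the oldest queued request."""
        return self._queue.get_nowait()


inbox = BoundedInbox(capacity=2)
accepted = [inbox.offer(i) for i in range(4)]  # burst of four requests
print(accepted)
```

In a real service the rejected requests would typically be answered with an error the client can retry on later (in gRPC, a status such as RESOURCE_EXHAUSTED would be a natural fit), which sheds a sudden burst without auto-scaling.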

Seems so much work for something already implemented with messaging brokers, though.

In my opinion, clients themselves shouldn't worry about whether the server is ready or not; they should only handle the case where the server does not respond within x seconds, and then simply crash or error out. It is the same thing you would do with any other external service call.

I think the biggest change gRPC brought was to the way I think. Dumb clients were always something I chose because they are an order of magnitude simpler to reason about. gRPC comes in and changes this by making the clients smart and the servers smarter, which brings a lot of complexity to the table.

Also, the configuration options are indeed obscure.
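The "give up after x seconds" client can be sketched with the standard library alone. `call_with_deadline`, `fast_service`, and `slow_service` are hypothetical names, and the sleeping function only simulates a slow server.

```python
import concurrent.futures
import time


def call_with_deadline(fn, timeout_s):
    """Run fn in a worker thread and stop waiting after timeout_s seconds."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        # The call itself is not cancelled; the client just stops waiting.
        raise TimeoutError(f"no response within {timeout_s}s") from None
    finally:
        pool.shutdown(wait=False)


def fast_service():
    return "ok"


def slow_service():
    time.sleep(0.3)  # simulates a server that is not ready
    return "late"


print(call_with_deadline(fast_service, 1.0))
```

The client stays dumb: it knows nothing about server state and only distinguishes "answered in time" from "did not".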

It's just a different framework for a different use case. The fact that it's HTTP/2 and supports bi-directional streams is nice for low latency applications.

It doesn't make RPC over message brokers deprecated, and if that works for you under your conditions, that's great.

You can say the same about message brokers. Is it LIFO, FIFO, or something random? What about acknowledgements: are they late or not? Can the queue hold responses (Rabbit advises against using it as a result store)?

In that regard message brokers are complicated too; let's just do http calls and let the load balancer take care of routing. It's much simpler and the client is very dumb.