by unoti 4592 days ago
> Length-prefixed message framing winds up being 10-100 lines of code in almost any language/environment.

Over the years, I've worked on several different systems that wrote those "10-100" lines of message framing from scratch, and I had to fix subtle, hard-to-track-down bugs in them. It's a conceptually simple thing in which it's very easy to have subtle edge-case bugs: edge cases that are difficult or impossible to reproduce on development systems, but that do happen on production systems once you're running high volumes. An example is a socket pausing in the middle of sending the multi-byte length, and needing to fiddle with certain socket parameters that control heartbeat and other minutiae.

It's certainly simple to write something that works well, but it's just as simple to write something that appears to work well yet fails in subtle ways under certain circumstances.

2 comments

With (dirt simple) length-prefix framing and blocking reads, pauses and such are non-issues: do a blocking read for the 2 length bytes, then a blocking read for the N bytes of payload. If a read fails for any reason, or you've timed out, then you give up and close the connection.
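For illustration, the blocking approach could look something like this Python sketch. The names `recv_exact`, `read_frame`, and `write_frame` are made up for the example, and the big-endian byte order of the 2-byte length is an assumption (the comment only says "2 length bytes"). The one detail that tends to get missed is that a single `recv` may return fewer bytes than requested, so the helper loops until it has all of them.

```python
import socket
import struct


def recv_exact(sock, n):
    """Block until exactly n bytes have arrived; raise if the peer closes early."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)  # may return fewer than `remaining` bytes
        if not chunk:  # empty result means the peer closed the connection
            raise ConnectionError("peer closed the connection mid-frame")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_frame(sock):
    """Read one frame: a 2-byte length (assumed big-endian), then the payload."""
    (length,) = struct.unpack("!H", recv_exact(sock, 2))
    return recv_exact(sock, length)


def write_frame(sock, payload):
    """Send one frame; a 2-byte length caps the payload at 65535 bytes."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload too large for a 2-byte length prefix")
    sock.sendall(struct.pack("!H", len(payload)) + payload)
```

To get the "give up and close" behaviour, one option is to call `sock.settimeout(...)` up front and treat any `OSError` from `read_frame` (which covers both timeouts and `ConnectionError`) as a reason to close the socket.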

Your application protocol needs to handle timeouts (some sort of retry, preferably with some notion of idempotency).

The problem with:

> > Length-prefixed message framing winds up being 10-100 lines of code in almost any language/environment.

is that it's not really a response to the ZMQ feature. With ZMQ you don't have to reimplement framing for every application and platform.

Can you elaborate on why a pause while sending is particularly tricky for a multi-byte length header compared to, say, a pause in any other part of the payload?
It's particularly tricky when dealing with async code, where you can't simply say "block here till you have 4 bytes" because all you get are events that say "you received n bytes and here they are."
All networking code should work even if it receives 1 byte at a time. Use a buffer, and have some sort of abstraction responsible for packetizing the input. The output of that module is a fully formed frame ready for interpretation.
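A minimal sketch of such a packetizing module in Python, for the event-driven case where bytes show up in arbitrary chunks. The class name `FrameDecoder`, the 4-byte big-endian length header, and the 1 MiB size cap are all assumptions made for the example, not anything specified in this thread.

```python
import struct


class FrameDecoder:
    """Accumulates raw bytes and emits complete length-prefixed frames."""

    HEADER = struct.Struct("!I")  # assumed: 4-byte big-endian length

    def __init__(self, max_frame=1 << 20):
        self._buf = bytearray()
        self._max_frame = max_frame  # guard against absurd or corrupt lengths

    def feed(self, data):
        """Add newly received bytes; return a list of any frames now complete."""
        self._buf += data
        frames = []
        while True:
            if len(self._buf) < self.HEADER.size:
                break  # the length header itself is still incomplete
            (length,) = self.HEADER.unpack_from(self._buf)
            if length > self._max_frame:
                raise ValueError("frame length exceeds max_frame")
            end = self.HEADER.size + length
            if len(self._buf) < end:
                break  # header is complete, payload is not
            frames.append(bytes(self._buf[self.HEADER.size:end]))
            del self._buf[:end]  # drop the consumed frame, keep the remainder
        return frames
```

Each "you received n bytes" event just calls `feed(data)` and hands whatever frames come back to the protocol layer. Because the header is parsed from the buffer rather than from a single read, it makes no difference whether the input arrives 1 byte at a time or several frames at once.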