Hacker News
by cryptonector 2977 days ago
I've written an HTTP service that offers "tail -f" as a service. When you GET a resource with a Range: bytes=0- header (or any other starting offset with no end offset), you get a response with chunked transfer encoding that doesn't terminate until the file is unlinked or renamed out of the way.

This is incredibly handy, both for the usual things one might want to tail -f (log files), and as a cheap-but-very-functional pub/sub system.

One thing I've learned is to watch out for silly proxies (client or reverse) that want to read the whole thing and then re-serve it as definite-length (i.e., not chunked, with Content-Length), or that impose a timeout on the transfer.

HTTP needs a way to say that a chunked encoding is of indefinite length, though arguably the Range: header in my case ought to be all the hint the proxies (and libraries!) need.
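
To make the tail-until-unlinked behaviour described at the top of this post concrete, here is a minimal Python sketch of the follow loop. It is not the service's code: the names follow, file_replaced, poll_interval and block_size are made up for illustration, and it polls with sleep instead of being evented. It also assumes a POSIX filesystem, where an open file can be unlinked or renamed and is identified by its device and inode numbers.

```python
import os
import time
from typing import Iterator


def file_replaced(fd: int, path: str) -> bool:
    """Return True once `path` no longer names the file open on `fd`."""
    opened = os.fstat(fd)
    try:
        current = os.stat(path)
    except FileNotFoundError:
        return True  # unlinked, or renamed with nothing put in its place
    # Renamed away and replaced by a new file: same path, different inode.
    return (opened.st_dev, opened.st_ino) != (current.st_dev, current.st_ino)


def follow(path: str, offset: int = 0, poll_interval: float = 0.5,
           block_size: int = 65536) -> Iterator[bytes]:
    """Yield bytes from `offset` onward until the file is unlinked or renamed."""
    with open(path, "rb") as f:
        f.seek(offset)
        while True:
            data = f.read(block_size)
            if data:
                yield data
                continue
            if file_replaced(f.fileno(), path):
                # Drain anything appended just before the unlink/rename.
                tail = f.read()
                if tail:
                    yield tail
                return
            time.sleep(poll_interval)  # assumption: polling, not epoll
```

In a server, each yielded block would go out as one HTTP chunk, and the terminating zero-length chunk would be written once the generator finishes.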

3 comments

Isn’t the chunk the unit? Why should it be unbounded? You’d have issues on the client side without knowing how long a chunk is going to be. It seems like the chunk should be the unit that can be guaranteed to fit in memory on the client.
Chunked transfer encoding means that the sender sends <length-in-hex>\r\n<chunk-of-that-length>\r\n... where each chunk can be a different length. A zero-length chunk terminates the transfer.

Chunked transfer encoding is used when the sender doesn't yet know the length of the resource's representation.
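
The framing described in this comment can be sketched in a few lines of Python. This is an illustration only: encode_chunk, decode_chunked and LAST_CHUNK are hypothetical names, the decoder works on a complete in-memory byte string instead of a socket, and it assumes no trailer fields follow the last chunk.

```python
def encode_chunk(data: bytes) -> bytes:
    """Frame one chunk: hex length, CRLF, payload, CRLF."""
    return b"%x\r\n%s\r\n" % (len(data), data)


# A zero-length chunk ends the body (assuming no trailer fields).
LAST_CHUNK = b"0\r\n\r\n"


def decode_chunked(stream: bytes) -> bytes:
    """Reassemble the payload from a complete chunked body."""
    body = bytearray()
    pos = 0
    while True:
        eol = stream.index(b"\r\n", pos)
        # Anything after ';' on the size line is a chunk extension; skip it.
        size = int(stream[pos:eol].split(b";", 1)[0], 16)
        pos = eol + 2
        if size == 0:
            break  # the zero-length chunk terminates the transfer
        body += stream[pos:pos + size]
        pos += size + 2  # skip the payload and its trailing CRLF
    return bytes(body)
```

Because every chunk carries its own length, the receiver never needs to know the total size up front, which is what lets a sender keep the response open for as long as it has more to say.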

You could consider using websockets, which are intended to be used the way you're trying to use chunking.
I'm aware. But this is easier to use. For example, here's how you tail some log file:

  curl -H 'Range: bytes=0-' https://foo.example/bar.log
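
For anyone who wants the same thing from a script, a rough Python equivalent of that curl command might look like the sketch below. tail_url, start and block_size are hypothetical names, and it assumes the server behaves as described in the post. Python's standard HTTP client decodes the chunked framing itself, so the caller only sees payload bytes.

```python
import urllib.request
from typing import Iterator


def tail_url(url: str, start: int = 0,
             block_size: int = 65536) -> Iterator[bytes]:
    """Yield data from `url` as it arrives, starting at byte offset `start`."""
    # An open-ended range: a start offset with no end offset.
    req = urllib.request.Request(url, headers={"Range": f"bytes={start}-"})
    with urllib.request.urlopen(req) as resp:
        while True:
            # read1() returns whatever is available instead of waiting
            # for a full block, so new data shows up promptly.
            data = resp.read1(block_size)
            if not data:
                return  # the server sent the terminating chunk
            yield data
```

Iterating over tail_url("https://foo.example/bar.log") and writing each block to stdout would then behave much like the curl invocation, subject to the same caveats about proxies and timeouts mentioned in the post.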
Is this OSS? Is it available for use anywhere?
Not yet. I've asked permission to open source it. It's basically epoll-based, evented, and it's darned simple.