Hacker News
by tristanz 4038 days ago
Martin's talks and blog posts are always awesome. I'm really excited to see how this plays out for real applications.

The one thing I'm always somewhat confused by though is how a "totally ordered log" intersects with the reality of a partitioned log. The simplicity of a log seems to break down a bit when you partition.

For instance, imagine I want to implement multi-key transactions on top of a distributed datastore. With a totally ordered log this is easy. But with a partitioned log, it becomes much harder.
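To make the "easy with a total order" case concrete, here is a minimal sketch, assuming a hypothetical `Transfer` record and an in-memory list standing in for the log. Every replica applies the same entries in the same order, so each one reaches the same commit/abort decision without talking to the others.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    """Hypothetical multi-key transaction: move `amount` from `src` to `dst`."""
    src: str
    dst: str
    amount: int


def apply_log(log, initial):
    """Apply transactions strictly in log order and return the resulting state."""
    state = dict(initial)
    for tx in log:
        # Commit or abort depends only on the state produced by earlier
        # entries, so every replica that reads the same log agrees.
        if state.get(tx.src, 0) >= tx.amount:
            state[tx.src] = state.get(tx.src, 0) - tx.amount
            state[tx.dst] = state.get(tx.dst, 0) + tx.amount
        # Otherwise the transaction aborts and the state is left untouched.
    return state


log = [
    Transfer("alice", "bob", 70),
    Transfer("alice", "carol", 50),  # aborts: alice only has 30 left
    Transfer("bob", "carol", 20),
]
initial = {"alice": 100, "bob": 0, "carol": 0}
replica_a = apply_log(log, initial)
replica_b = apply_log(log, initial)
```

If `alice` and `bob` lived in different partitions, there would be no single log position that orders a transfer against both keys, which is where the extra coordination comes in.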

Alternatively, imagine I want to implement a collaborative editing app like Google Docs or something like Slack. A natural design would be to have millions of independent logs. I can then replay logs to get current state and watch logs to keep it updated. But as far as I'm aware, partitioned logs like Kafka do not actually support millions of topics. So there's no way to replay a log for something like a channel or document.
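Roughly what I mean by "replay to get current state, watch to keep it updated", as a sketch. The event shape `(op, pos, value)` and the function names are made up for illustration; a real editor would need something much richer (concurrent edits, for a start).

```python
def apply_event(text, event):
    """Apply one edit event to the document text and return the new text."""
    op, pos, value = event
    if op == "insert":
        return text[:pos] + value + text[pos:]
    if op == "delete":
        # For deletes, `value` is the number of characters to remove.
        return text[:pos] + text[pos + value:]
    raise ValueError(f"unknown op: {op}")


def replay(events, text=""):
    """Fold a document's log, in order, into its current text."""
    for event in events:
        text = apply_event(text, event)
    return text


# One independent log per document.
doc_log = [("insert", 0, "helo"), ("insert", 3, "l"), ("delete", 0, 1), ("insert", 0, "H")]

current = replay(doc_log)                              # a client connects
current_after_watch = apply_event(current, ("insert", 5, "!"))  # a new event arrives
```

The catch is that this only works if "the log for this document" is something you can actually ask for.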

2 comments

One thing to realize is that a partitioned log is a generalization of an unpartitioned log (i.e. if you set # partitions = 1 in a partitioned log you have an unpartitioned log).

In Kafka the purpose of partitions is to provide computational parallelism, not to model entities in the world. So if you have 100m users you would map that into a number of partitions based on your computational parallelism (maybe 10-100 machines/processes/threads). In other words you would have a single topic partitioned by user id, not a topic per user.
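A minimal sketch of that mapping, assuming a hypothetical `partition_for` helper and using CRC32 as a stand-in for whatever hash a real client applies to the key. The point is only that the partition count is sized for parallelism while the key space can be arbitrarily large.

```python
import zlib


def partition_for(user_id: str, num_partitions: int) -> int:
    """Map a user id onto one of `num_partitions` partitions."""
    # crc32 is stable across processes, unlike Python's built-in hash() on str.
    return zlib.crc32(user_id.encode("utf-8")) % num_partitions


NUM_PARTITIONS = 16  # chosen for parallelism, not for the number of users

# Many users share each partition; a given user always lands on the same one.
counts = [0] * NUM_PARTITIONS
for n in range(10_000):
    counts[partition_for(f"user-{n}", NUM_PARTITIONS)] += 1
```

Changing `NUM_PARTITIONS` later changes where existing keys land, so the count is worth picking with some headroom.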

If you have a centralized relational database, that maps reasonably well to a single-partition log (both in terms of scalability and guarantees).

For distributed databases you generally don't have a total order over all operations. What you usually have is (at best) a per partition ordering, which maps well to a partitioned log as well.

For applications that record events (logging or whatever) it is natural to think of each application thread or process as a kind of actor with a total order.

Yeah, this is a good way to look at it. But it's also my point. Partitions are about parallelism and don't always fit the data model or domain. While you can reduce the partitions to 1, this limits parallelism. It's not always easy to design a partitioning scheme that preserves the replication semantics you want. And these semantics vary from having something totally ordered, to having something that you can replay on a very fine-grained level (to replicate to the client, for instance). Most discussions of the advantages of logs really emphasize how amazing a totally ordered log is for replication, but that's not actually what production deployments look like, so you still need to think carefully about what happens when writes are being applied to your datastores without a clear order.
What Kafka gets right is that it explicitly addresses the fact that if you add parallelism, order goes out of the window (modulo partition key).
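A toy in-memory model of "order survives per key, not globally". `PartitionedLog` and `per_key_order` are hypothetical names, and CRC32 is just a convenient stable hash, not a claim about any real client.

```python
import zlib
from collections import defaultdict


class PartitionedLog:
    """Toy partitioned log: each partition is an append-only list."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key, value):
        p = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1  # (partition, offset)


def per_key_order(log):
    """Collect each key's values in the order a consumer would read them."""
    seen = defaultdict(list)
    for partition in log.partitions:
        for key, value in partition:
            seen[key].append(value)
    return dict(seen)


writes = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]

log4 = PartitionedLog(4)
log1 = PartitionedLog(1)
for key, value in writes:
    log4.append(key, value)
    log1.append(key, value)
```

With four partitions, each key's values still come back in write order, but nothing records how writes to different partitions interleaved. With one partition, `log1.partitions[0]` is simply the full write order.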

This is something that tends to surprise developers early on (myself included, years ago). But plenty of people still use queue solutions like RabbitMQ without thinking it all the way through.

Unfortunately, partitioning introduces a design step that makes it a little harder to make processing generic. With RabbitMQ you just post to an exchange and let queues (i.e., consumers) filter on the routing keys; if no queues have been bound, for example, messages don't go anywhere. If you want, or don't want, parallelism, you just run either multiple consumers or just one. With Kafka, you need to decide beforehand, and design the "topology" of your log carefully, not just for the producer, but for each consumer. When producers and consumers are different apps, this starts smelling like a violation of the principle of "separation of concerns".
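The exchange/binding model, sketched as a toy in-memory class. This is not a real client library's API, just an illustration of exact-match routing where the producer names a routing key and consumers decide what to bind.

```python
from collections import defaultdict


class Exchange:
    """Toy model of a direct exchange: routes by exact routing-key match."""

    def __init__(self):
        self.bindings = defaultdict(list)  # routing key -> bound queues

    def bind(self, queue, routing_key):
        self.bindings[routing_key].append(queue)

    def publish(self, routing_key, message):
        """Deliver to every queue bound to the key; return how many received it."""
        queues = self.bindings.get(routing_key, [])
        for queue in queues:
            queue.append(message)
        return len(queues)


exchange = Exchange()
delivered_before_binding = exchange.publish("user.signup", "m0")  # nothing bound yet

emails, audit = [], []  # plain lists standing in for queues
exchange.bind(emails, "user.signup")
exchange.bind(audit, "user.signup")
exchange.bind(audit, "user.delete")

exchange.publish("user.signup", "m1")
exchange.publish("user.delete", "m2")
```

The producer never learns how many consumers exist or how they are parallelized, which is the decoupling that a partitioned log makes you give up part of.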

I rather wish Kafka had a better routing mechanism, actually. I don't see any reason why it couldn't have routing keys, just like RabbitMQ.

You could map the document id or channel id so that it always goes to the same partition; then that particular document or channel log would be ordered. Seems like it would work.
The problem with this is there is no way to replay the log when clients connect without replaying the whole topic, which can be huge.
True, hadn't thought about that.
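To put rough numbers on the replay concern, a sketch that models a partition as a plain list of `(doc_id, event)` records and assumes there is no per-key index to jump to. `replay_document` is a hypothetical helper.

```python
def replay_document(partition, doc_id):
    """Return one document's events and how many records had to be read."""
    events, scanned = [], 0
    for record_doc_id, event in partition:
        scanned += 1  # every record is read, whether or not it matches
        if record_doc_id == doc_id:
            events.append(event)
    return events, scanned


# 1,000 documents sharing one partition, 100 edits each, interleaved.
partition = [(f"doc-{i % 1000}", f"edit-{i}") for i in range(100_000)]

events, scanned = replay_document(partition, "doc-7")
```

Here a connecting client reads 100,000 records to recover the 100 that belong to its document, and the ratio gets worse as more documents share the partition.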