

by boredandroid 1965 days ago

In my opinion this is pre-cloud thinking.

It used to be that distributed systems were a big trade-off. They were operationally complex and had limited APIs (NoSQL), but they scaled. The best solution used to be to build things on a non-scalable but easy-to-use, easy-to-run system, and then rewrite it later if it needed to scale (often in a big hurry).

This is just not the case any more, though. Why? Two reasons:

1. We’ve gotten much better at distributed systems, so the APIs aren’t nearly as limited. It’s no longer a choice between hard-to-use things like Hadoop/NoSQL and elegant but unscalable single-server databases. You can have both good abstractions and scale.

2. The cloud makes it possible to get systems as a service, so there should be way less ops than running a single-node system yourself.

In the case of Kafka, I’m super biased as I’m one of the original authors, but I think the abstractions Kafka gives, stream processing capabilities, connectors, etc. are just way better than a lot of the traditional solutions. Using something worse until you “need” Kafka might make sense on premises, but not in the cloud.

Confluent offers a fully managed Kafka service, so you don’t do any of the upgrades, security patches, midnight pages, etc.; you just use the APIs. This is super affordable for the kind of simple apps the article describes. The price varies by cloud, but e.g. on GCP it starts at $0.11/GB for reads and $0.10/GB stored. That is a lot cheaper than using a single-node system and then rebuilding everything if you need to scale, and on top of that it has lower operational overhead (effectively none) and a better interface/abstraction.
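To put those numbers in perspective, here is a rough back-of-the-envelope sketch. It uses only the two prices quoted above; the traffic figures are made up, the assumption that storage is billed per GB per month is mine, and any other charges (writes, networking, minimums) are ignored, so treat the result as an illustration rather than a quote.

```python
# Starting prices quoted in the comment (GCP); actual pricing may differ.
READ_PRICE_PER_GB = 0.11      # USD per GB read
STORAGE_PRICE_PER_GB = 0.10   # USD per GB stored (assumed: per month)


def estimate_monthly_cost(read_gb: float, stored_gb: float) -> float:
    """Estimate a monthly bill in USD from read traffic and stored data only.

    Hypothetical helper: it ignores every charge other than the two
    prices quoted in the comment.
    """
    return read_gb * READ_PRICE_PER_GB + stored_gb * STORAGE_PRICE_PER_GB


if __name__ == "__main__":
    # Hypothetical small app: 50 GB read and 20 GB retained in a month.
    print(f"${estimate_monthly_cost(50, 20):.2f}")
```

For a small app on that scale, the two quoted line items come to a few dollars a month.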

I think this isn’t unique to Kafka, either. There are great managed systems that are built to scale for most of the kinds of data systems you would use: CockroachDB, Spanner, Aurora, Snowflake, Elasticsearch, BigQuery, etc.

Basically, you can have nice things now, just like the big tech companies.

1 comment

dan_quixote 1965 days ago

The problem I see with Kafka is that it was built before cloud architectures were commonly adopted (with distributed systems everywhere). Confluent has put a lot of effort into dragging Kafka's architecture to the present, but some major features are missing:

  - Auto-scaling:  Confluent finally introduced "elastic scaling" a few months ago but it only allows you to scale up and must be triggered by the admin (no threshold-based auto-scaling).

  - Multi-tenancy:  Planning for a multi-tenant Kafka cluster is not for the faint of heart.  Achieving isolation tends toward liberal use of topics, which starts to become unmanageable in the low thousands.  That count isn't crazy when you've got a few hundred microservices and several tenants to keep isolated.

  - Decoupled brokers and storage:  Any broker scaling or failure can lead to downtime while event storage is redistributed.
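To make the multi-tenancy arithmetic concrete, here is a minimal sketch. It assumes the simplest isolation scheme, one topic per service per tenant; that scheme, the function name, and the example figures are illustrative assumptions, not a description of any particular deployment.

```python
def topic_count(services: int, tenants: int, topics_per_service: int = 1) -> int:
    """Topics needed if every tenant gets its own copy of every service topic.

    Hypothetical model: isolation is achieved purely by giving each
    (service, tenant) pair dedicated topics.
    """
    return services * tenants * topics_per_service


if __name__ == "__main__":
    # "A few hundred microservices and several tenants", e.g. 300 and 5.
    print(topic_count(300, 5))
```

Under that assumption, 300 services and 5 tenants already need 1,500 topics, which is squarely in the "low thousands" range mentioned above.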

Confluent's Cloud service reduces operational overhead but isn't always feasible due to cost, resource limits (like service accounts or schemas for instance), data controls, etc.