| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by dagss 1271 days ago

We do mini-batch processing in SQL. Some hundred milliseconds latency, some hundred events consumed from to the inbound event table per iteration. Paginate through using a (Shard, EventSequenceNumber) key; writers to table synchronize/lock so that this is safe.

Kafka-in-SQL if you wish. Or, homegrown Flink.

(There are many different uses for the events inside our SQL processing pipelines, and have to store the ingested events in SQL anyway)

I am sure real Kafka+Flink has some advantages, but...what we do works really well, is simple, and feels right for our scale.

It is enough batching in SQL to real speed/CPU benefits on inserts/updates into SQL (vs e.g. hitting SQL once per consumed event which would be way worse). And with Azure SQL the infra is extremely simple vs getting a Kafka cluster in our context.