Hacker News new | ask | show | jobs
by dagss 502 days ago
We did this style on top of plain MSSQL. Each event would have a SQL table which is the primary storage. Then we have workers that listens to new data in tables and updates projections we needed. (Sometimes DB triggers but mostly async workers.)

The main issue is "listening to new data in a SQL table". I wrote this code to achieve it in MSSQL (perhaps it is somehow built into postgres?): https://github.com/vippsas/mssql-changefeed

In my experience this approach is beautiful; as Martins says, our backend code is mostly stateless and functional these days, why have mutable objects in the DB? And the approach is extremely useful for dry-running business logic etc

But we didn't like the prospect of adopting Kafka wholesale. Having all the data in a SQL DB is extremely convenient for debugging, and since we already used SQL it was a smaller change that was done first where it made most sense and then spread out.

It would be great with more DB features targeting this style. Explicit partition event tables (kafka-in-SQL), and writing a projection simply as a SQL query which is inverted into an async trigger by the DB would be awesome. (MSSQL has indexed views, but it cannot be done online...)

Materialize is the DB I know about in this territory.

7 comments

I use the architecture you described as the go-to architecture for most of so called "business" applications (with PostgreSQL as dbms though).

The missing pieces are:

- incremental materialized view maintenance

- (bi)temporal primary and foreign keys (upcoming in Pg18)

Lately I found out about DBSP and Feldera which looks very promising as it is based on sound theory: https://github.com/feldera/feldera?tab=readme-ov-file#-theor...

Great link, thanks!

What do you do postgres for consuming old events and smoothly transitioning to consuming new events? Anything like an event ID allocated at commit time that is usable? As I talk about in sibling comment..

> perhaps it is somehow built into postgres?

Postgres has a built-in listen/notify mechanism. The problem with that is, that it doesn't guarantee delivery and if no process is listening, notifications will be lost.

Most solutions that need something like that use "logical decoding" these days. That's the built-in change data capture exposed as a public API as part of the logical replication.

Yes, listen/notify is something very different. We would often write new projections that consumes events from years back and until today.

You want sequence numbers that indicate the event's position in a partitioned log.

Something like "int identity" except that the int is assigned during commit, so that you have guarantee that if you see IDs 5 and 7, then 6 will never show up, so that each consumer can store a cursor of its progress of consuming the table which is safe against inserts.

I was hoping to do it using CDC, but Microsoft SQL has a minimum 1 minute delay on CDC which destroys any live data usecase. Perhaps postgres allows listening to the replication log with lower latency?

> Perhaps postgres allows listening to the replication log with lower latency?

Yes, I think that's what the "logical decoding" referred to. Postgres can emit a "logical" version of the WAL (something with a stable spec writtten down so that other services can stream and decode it). My understanding was that "logical replication" was designed for low latency situations like creating read replicas.

I haven't heard of the logical log being preserved for "years back" but that's an interesting case...

That is OK, guess I would write a job to listen to the logical WAL and use it to do an update that writes an event sequence number.
> Something like "int identity" except that the int is assigned during commit, so that you have guarantee that if you see IDs 5 and 7, then 6 will never show up

I don't think that's possible, nor is it something you should actually need.

If two transactions tx1 and tx2 are concurrent (let's say tx2 begins after tx1 and also finishes after tx1), then tx2 has done some work without access to tx1's data (as tx1 hadn't committed yet when tx2 began). So either:

- tx1's data is relevant to tx2, so tx2 needs to abort and retry. In which case the sequence number doesn't _need_ to be assigned at commit time, it can be assigned at any time and will be increasing monotonically between related transactions. - tx1's data is irrelevant to tx2, in which case the ordering is irrelevant and you don't need to assign the sequence number as late as commit-time.

The "relevance" is what partition keys encode: if tx1 and tx2 are potentially conflicting, they should use the same partition key. It doesn't enforce that sequence numbers increase monotonically within a physical partition, but it enforces that they do for a given _partition key_ (which is what should matter, the key->partition assignment is arbitrary).

> Perhaps postgres allows listening to the replication log with lower latency?

Pretty sure it does, you can listen to the WAL which is as instant as it gets. We were doing that in a previous company: a process (debezium) would listen to the WAL for a specific "events" table and write to Kafka. The main downside is that the table isn't an outbox, it keeps growing despite the events having been pushed to Kafka.

You explain why I don't need them for a very different usecase than what I refer to.

My point is I want a new primitive -- a pub/sub sequence number -- to avoid having Kafka around at all.

What Kafka does is "only" to generate and store such a sequence number after all (it orders events on a partition, but the sequence number I talk about is the same thing just different storage format). So also you do need it in the setup you describe, you just let Kafka generate it instead of having it in SQL.

Assuming your workload is fine with a single DB, the only thing Kafka gives you is in fact assigning such a post-commit row sequence number (+API/libraries building on it).

This is the mechanism used to implement pub/sub: Every consumer tracks what sequence number it has read to (and Kafka guarantees that the sequence number is increasing).

That is what mssql-changefeed linked above is about: Assigning those event log sequence numbers in the DB instead. And not use any event brokers (or outboxes) at all.

For postgres I would likely then consume the WAL and write sequence numbers to another table based on those...

It may seem clunky but IMO installing and operating Kafka just to get those pub/sub sequence numbers assigned is even clunkier.

It sounds like the Log Sequence Number in Postgres is what you are looking for. If you subscribe to a Postgres publication via logical replication, each commit will be emitted with a monotonically increasing LSN.
I'm not aware of any one minute minimum delay for CDC. We currently are running an on prem SQL Server -> CDC -> Debezium -> Azure Event Hub -> Azure Function App back to on prem SQL Server and that has 5-10 second delay from source system transaction commit till update/insert.
> The main issue is "listening to new data in a SQL table".

You may want to take a look at Service Broker[0]. It's the idiomatic messaging and queuing bit of SQL Server. It's a bit of an obscure feature and has a bit of a steep learning curve. If I were trying to implement what you're doing it would be the tool I'd reach for.

[0] https://learn.microsoft.com/en-us/sql/database-engine/config...

I would love to know if other people in the industry (beside hickey/datomic) use the immutable log/stream + integrators. From my small experience in enterprise app: auditability and time travelling are always bolted on good old sql tables/snapshots after the fact and the pain is already baked in.
Depends which industry. If you look at a lot of non-tech industry then they'll use a commercial DB with all those features in place already, rather than hacking up their own data layer. A few years ago I spent some time in the enterprise finance space, and learned some unfashionable tech you don't see talked about on Hacker News much. It left me with a new appreciation for what goes on there. A staggering amount of time spent in tech startups is spent on solving and resolving problems that you can buy off the shelf solutions for and have been able to for a long time.

After all, this talk is now 10 years old but appears to be describing features that have been around for much longer. Take your average bank - it will have a bunch of Oracle databases in it. Those already have every feature discussed in this thread and in the talk:

• Incremental materialized view maintenance (with automatic query rewrite to use it, so users don't have to know it exists).

• Exposing logical commit logs as an API, with tooling (e.g. GoldenGate, LogMiner, query change notifications).

• Time travelling SELECT (... AS OF).

• Lots of audit features.

• Integrated transactional and scalable MQ (no need for Kafka).

My experience was that faced with a data processing problem, enterprise devs will tend to just read the user guide for their corporation's database, or ask for advice from a graybeard who already did so. They go write some SQL or an Excel plugin or something old school, ship it, close the ticket, go home. Then a few years later you look at HN and find there's a whole startup trying to sell the same feature.

Who besides Oracle offers this stuff though?

Yeah Oracle has a bunch of nice features, it also costs a gajillion dollars that no one besides a large enterprise can afford.

Debezium is popular in this space, though it does bring more tools into the CDC/CQRS stack: https://debezium.io/
I got curious about this and took a look at some pricing calculators. The results are pretty counter-intuitive.

Compared to PostgreSQL that you run yourself, it's expensive because PostgreSQL you run yourself costs nothing if you assume your time is free. But, how many people want to run it themselves? Especially as Postgres isn't much fun to admin (fiddling with vacuuming, setting up replication by hand, managing major version upgrades etc and you may not be able to scale this way).

So in reality a lot of companies and especially startups these days pay Amazon to run the database for them, and so then the cost question is how much more does it cost for a cloud hosted Oracle DB vs a cloud hosted Postgres DB?

Well, an 8 vCPU hosted RDS Postgres in AWS with 32 GB of memory and 100 GB of storage plus another 200 GB of backup storage - so one less powerful than a local DB on my laptop - costs $1,200/month in US East. That's expensive! AWS doesn't let you scale CPU and RAM independently, so I tried to pick something in the middle. For only 100 GB of data you probably don't need 4 physical cores.

So then I checked the OCI (Oracle Cloud) price calculator and specced out a similar database. I picked autonomous serverless (i.e. fully managed), transaction processing+mixed, autoscaling with 8 ECPUs and same amount of primary/backup storage. They don't let you spec RAM independently, I guess because it's a shared DB so RAM usage is transient and not a VM allocation. The cost came to ~$800/month - that's significantly cheaper than RDS Postgres despite that Oracle DBs have drastically more features. Many of which are optimizations that can reduce your database load anyway, so presumably you need more Postgres cores to match the equivalent performance if those features are used smartly (honestly I haven't ported an app between postgres and oracle so I don't have experience with this).

This is pretty surprising. I'd have expected an Oracle DB to cost more, not less. Auto-scaling is part of it (cost is double RDS if you turn that off), but then again, this is possible because Oracle has more multi-tenant and resource isolation features to begin with so it's reasonable to share a database server and overcommit CPU. With AWS it's a full VM so you have to stop the db server manually if you want to save money. Also OCI is a cheaper cloud than AWS as it has less brand recognition I guess. This feels a bit like Amazon is exploiting people's mental defaults. Lots of devs think AWS and Postgres are the only cloud+db combination that is reasonable to consider, and apparently they charge on that basis?

I haven't specced out what a hosted bare metal cluster would cost. You can't cluster Postgres in the same way anyway (multi-write master with full SQL, no sharding).

time based snapshots are in datomic and also possible in other dbms (maybe via extensions)

for the rest i don't know

XTDB (inspired by datomic) also has bitemporal queries.
thanks I couldn't remember that name
Wouldn't surprise me a bit. Thanks for the detailed comment.
In large scale business intergration platforms/apps, you have operational systems like SAP and and Oracle Service Cloud generate/stream raw or business events which are published to message brokers in topics ( orders, incidents, suppliers, logistics, etc). There the data is published , validated, transformed (filtered, routed, formatted, enriched, aggregated, etc) into other downstream topics which can be used to egress to other apps or enterprise data stores/data lakes. Data governance apps control who has access. Elastic search or Splunk for data lineage and debugging. you also have sbservability systems sandwiched in there as well.
thoughts from 1992 (Gray+Reuter): https://news.ycombinator.com/item?id=42829878
very interesting, every generation foresees the same solutions somehow
I always thought that the most flexible approach was:

- good old mutable relational tables

- a separate db to store immutable events (could be the same kind of db you use for business transactions or something fancy like big query)

I feel like mixing both into one has more disadvantages

This is of course the more common approach, I was aware that our approach is not usual which is why I posted.

You don't list the disadvantages so cannot respond to that. But I really like the code resulting from flipping it around. It just fits how I think and I now feel "unsafe" messing around with code that mutates state directly.

The programming style of those mutable relational tables is a bit like mutable objects in Java -- eventually people moved to more stateless code and immutable objects. The same shift doesn't have to happen in the storage layer, but it is what the OP (and I) argue for.

I really enjoy having the tools around to replay any object from history as part of the online backend. For instance, consider if there was a bug so that customer input was buggy in some timeframe. Instead of writing a job that looks through the history of each customer and tries to figure out if you should mutate the data, then roll out that ad hoc mutation while holding your breath -- you can add some rules when fetching the relevant events on lookup-time and change how they reflect the state; i.e. a change that is only read only and only changes deployed code and not data, and roll back by only rolling back the service not by reverting a data change.

“ The main issue is "listening to new data in a SQL table". I wrote this code to achieve it in MSSQL (perhaps it is somehow built into postgres?): https://github.com/vippsas”

Postgres uses “publications” for this purpose. Clients can subscribe to a publication that gets updates to a given table.

I have implemented something similar (updating projection from background and serving the projection automatically via REST) and wrote a high-Level article about it https://www.fabianzeindl.com/posts/the-api-database-architec...