by bennyelv 1007 days ago
Absolutely correct, listen to this article's ideas with great scepticism!

The system that I'm currently responsible for made this exact decision. The database is the API, and all the consuming services dip directly into each other's data. This is all within one system with one organisation in charge, and it's an unmanageable mess. The pattern suggested here is exactly the same, but with each of the consuming services owned by different organisations, so it will only be worse.

Change in a software system is inevitable, and in order to safely manage change you require a level of abstraction between the inside of a domain and the outside, plus a strictly defined API contract with the outside that you can version control.

Could you create this with a layer of stored procedures on top of database replicas as described here? Theoretically yes, but in practice no. In exactly the same way that you can theoretically service any car with only a set of mole-grips.

This is just an interface, and you have the same problems with versioning and compatibility as you do with any interface. There's no difference here between the schema/semantics of a table and the types/semantics of an API.

IME what data pipelines do is they implement versioning with namespaces/schemas/versioned tables. Clients are then free to use whatever version they like. You then have the same policy of support/maintenance as you would for any software package or API.
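
A minimal sketch of the versioned-table idea, assuming SQLite and using views as the versioned surface. The names `orders_internal`, `orders_v1` and `orders_v2` are hypothetical, and a real warehouse would more likely use separate schemas or namespaces than a name suffix:

```python
import sqlite3


def build_demo() -> sqlite3.Connection:
    """Create one internal table and two versioned views over it (all names hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders_internal (
            id            INTEGER PRIMARY KEY,
            customer_name TEXT    NOT NULL,
            amount_cents  INTEGER NOT NULL
        );
        INSERT INTO orders_internal VALUES (1, 'Ada', 1250), (2, 'Grace', 300);

        -- v1 contract: amount exposed as a decimal number of dollars
        CREATE VIEW orders_v1 AS
            SELECT id, customer_name AS customer, amount_cents / 100.0 AS amount
            FROM orders_internal;

        -- v2 contract: amount exposed as integer cents
        CREATE VIEW orders_v2 AS
            SELECT id, customer_name AS customer, amount_cents
            FROM orders_internal;
        """
    )
    return conn


conn = build_demo()

# Each client pins the version it was written against.
v1_rows = conn.execute("SELECT id, customer, amount FROM orders_v1 ORDER BY id").fetchall()
v2_rows = conn.execute("SELECT id, customer, amount_cents FROM orders_v2 ORDER BY id").fetchall()
print(v1_rows)
print(v2_rows)
```

Old clients keep reading `orders_v1` until it is retired under whatever support policy applies, the same way an old API version would be.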

> There's no difference here between the schema/semantics of a table and the types/semantics of an API.

There is a big difference. The types of an API can be changed independently of your schema.
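
To make that concrete, here is a rough sketch of an API type decoupled from the storage shape. `UserRow`, `UserV1` and `to_api_v1` are made-up names for illustration, not anything from the article:

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class UserRow:
    """Hypothetical storage shape; free to change with the schema."""
    id: int
    first_name: str
    last_name: str
    created_epoch: int


@dataclass(frozen=True)
class UserV1:
    """Hypothetical public API type; changes only with a new API version."""
    id: int
    display_name: str


def to_api_v1(row: UserRow) -> UserV1:
    # The mapping is the only place that knows about both shapes.
    return UserV1(id=row.id, display_name=f"{row.first_name} {row.last_name}")


payload = asdict(to_api_v1(UserRow(1, "Ada", "Lovelace", 0)))
print(payload)
```

If the schema later merges or splits the name columns, only `UserRow` and `to_api_v1` need to change; `UserV1` and its consumers do not.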

You're looking at the wrong layer. If we were to go to the layer you're talking about, we'd have internal and external tables where we could change the structure of the internal tables, and then rebuild/rematerialize the external tables/views from the internal ones.
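
A small sketch of that internal/external split, again assuming SQLite and hypothetical names (`customers_internal`, `customers_internal_v2`, `customers_external`). The internal table is restructured and the external view is rebuilt, while the consumer's query stays the same:

```python
import sqlite3

# The only thing a consumer is allowed to depend on (hypothetical name).
CONSUMER_QUERY = "SELECT id, full_name FROM customers_external ORDER BY id"

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers_internal (id INTEGER PRIMARY KEY, full_name TEXT NOT NULL);
    INSERT INTO customers_internal VALUES (1, 'Ada Lovelace'), (2, 'Grace Hopper');

    CREATE VIEW customers_external AS
        SELECT id, full_name FROM customers_internal;
    """
)
before = conn.execute(CONSUMER_QUERY).fetchall()

# Internal refactor: split the name into two columns, then rebuild the external view.
conn.executescript(
    """
    DROP VIEW customers_external;

    CREATE TABLE customers_internal_v2 (
        id         INTEGER PRIMARY KEY,
        first_name TEXT NOT NULL,
        last_name  TEXT NOT NULL
    );
    INSERT INTO customers_internal_v2
        SELECT id,
               substr(full_name, 1, instr(full_name, ' ') - 1),
               substr(full_name, instr(full_name, ' ') + 1)
        FROM customers_internal;
    DROP TABLE customers_internal;

    CREATE VIEW customers_external AS
        SELECT id, first_name || ' ' || last_name AS full_name
        FROM customers_internal_v2;
    """
)
after = conn.execute(CONSUMER_QUERY).fetchall()

# The external contract is unchanged even though the internal layout is different.
assert before == after
```

This only covers the easy case of a lossless restructure; changes that cannot be expressed as a view over the new tables still need a new external version.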

If the external tables are views that can combine select columns from multiple tables with computed fields - maybe. In theory it’s good, in practice I’ve never seen it done well.

I do think tools to manage this stuff... basically don't exist, so I'm sympathetic to the argument that while there's mostly equivalency between data and software stacks, software stacks are way more on the rails than data stacks are. Which is to say, I have seen this stuff work well with experienced data engineers, but I think you need more experience to get the same success on the data side than you do on the software side.

Yeah, I could see that. It’s not common and the tooling is primitive. Same thing I would say about event sourcing. Great in theory, but it’s more likely to get your average team into trouble.