by gunnarmorling 1693 days ago
No matter how you integrate different applications, be it via APIs, messaging, or a database, it's vital to separate your application's internal data model from models which it exposes. If you don't do that, you're in for a never-ending story of upstream services unknowingly breaking downstream services, or upstream services not being able to evolve in any meaningful way.

So if they mean directly exposing a service's data model from the database to other services, I'm very skeptical. If they mean providing that access by means of some abstraction, e.g. database views, it can be an option in some cases.

You'll still lose a lot of the flexibility you'd gain by putting some middleware in between, e.g. the ability to scale out compute independently from storage, to merge data from multiple sources into one API response, to implement arbitrarily complex business logic (e.g. in Java), etc.

2 comments

> it's vital to separate your application's internal data model from models which it exposes

It really depends.

If you have many different client services that need to access the database in a similar way, then you are right, it makes sense to add some type of abstraction. Then you can change the underlying model and only need to update the common logic once.

Example: if two services both need to create user accounts, it makes sense to encapsulate the logic for creating user accounts somewhere. (A common practice is to use stored procedures inside the database for that.)
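A rough sketch of that idea, using a shared Python function against an in-memory SQLite database instead of a stored procedure. The `accounts` table, the `create_user_account` name and the validation rule are all made up for illustration:

```python
import sqlite3


def create_user_account(conn: sqlite3.Connection, email: str) -> int:
    """Single shared place for the (hypothetical) account-creation rules."""
    normalized = email.strip().lower()
    if "@" not in normalized:
        raise ValueError(f"invalid email: {email!r}")
    with conn:  # commits on success, rolls back if the insert raises
        cur = conn.execute(
            "INSERT INTO accounts (email) VALUES (?)", (normalized,)
        )
    return cur.lastrowid


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, email TEXT UNIQUE NOT NULL)"
)

# Two hypothetical callers (say, a signup service and an admin tool)
# go through the same function instead of writing their own INSERTs.
signup_id = create_user_account(conn, "  Ada@Example.com ")
admin_id = create_user_account(conn, "grace@example.com")
```

If the table layout changes later, only `create_user_account` has to be updated, not each caller.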

But if the services access the database in very different ways, then your abstraction may end up just making things more complicated. You'll have just as much work to update as if you updated all the services individually.

Example: An internal dashboard may need to access the data in unique ways. If you route the dashboard service through some middle layer, then that middle layer would have lots of APIs that are used only by the dashboard. So you gain nothing from the abstraction. Any change to the dashboard requires updating the middle layer, and any change to the database also requires updating the middle layer. It's just as much effort to make changes as if there were no middle layer; you've just split the dashboard logic into two parts and made it harder to understand.

I agree, but would you mind providing a concrete example of:

> No matter how you integrate different applications, be it via APIs, messaging, or a database, it's vital to separate your application's internal data model from models which it exposes.

Especially in respect to this product.

Are you asking for an example of separating the data models or an example of what happens when you don't?

This is a good example of what separation enables you to do:

https://www.troyhunt.com/your-api-versioning-is-wrong-which-...

(it is presented from the perspective of how to version an API, but the examples of what it looks like are there)

A field you rely on changed from integer to string. Or goes away. Or was unique but a duplicate shows up. Or the database disappears for a few minutes every hour.

Over time, the internal data model might need to change to more accurately model the world. For consistency and efficiency, we usually do not want to maintain the combination of the original low-fidelity model and new high-fidelity model within the internal database.

The internal data model might need to be reorganized to improve the efficiency of part of the internal application.

A client might not need a higher fidelity model. The internal reorganization of the data model might not be pertinent to the client's interface. So we normally have some data-mapping in the application to help provide a stable interface for clients.
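As a minimal sketch of that kind of data-mapping, assume the internal model was changed to split a name into two fields while clients still expect a single `full_name`. The class and function names here are hypothetical, not taken from any particular product:

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class InternalUser:
    """Hypothetical internal model after a schema change: the name is split."""
    id: int
    given_name: str
    family_name: str


@dataclass(frozen=True)
class UserV1:
    """Hypothetical stable representation that clients already depend on."""
    id: int
    full_name: str


def to_v1(user: InternalUser) -> UserV1:
    # The mapping absorbs the internal change, so the exposed shape is unchanged.
    return UserV1(id=user.id, full_name=f"{user.given_name} {user.family_name}")


internal = InternalUser(id=7, given_name="Ada", family_name="Lovelace")
payload = asdict(to_v1(internal))
```

Clients keep receiving `id` and `full_name`; the internal split never reaches them unless a new version of the exposed model is introduced on purpose.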

It's possible to provide these compatibility mappings within the database through views, but this is usually considered to be harder to control, test and scale.
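For comparison, a view-based compatibility mapping might look roughly like the following, shown with Python's built-in `sqlite3` module. The table, view and column names are assumptions for illustration only:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical internal table after a reorganization: name split in two.
    CREATE TABLE users_internal (
        id INTEGER PRIMARY KEY,
        given_name TEXT NOT NULL,
        family_name TEXT NOT NULL
    );

    -- Compatibility view that keeps exposing the older, single-column shape.
    CREATE VIEW users_v1 AS
        SELECT id, given_name || ' ' || family_name AS full_name
        FROM users_internal;

    INSERT INTO users_internal (id, given_name, family_name)
    VALUES (7, 'Ada', 'Lovelace');
    """
)

# A client that only knows the old shape queries the view, not the table.
rows = conn.execute("SELECT id, full_name FROM users_v1").fetchall()
```

The mapping logic now lives in the schema itself, so any change to it is a database migration rather than an application deploy.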

Maybe in a big application, the application-layer mapping and caching eventually get complicated enough to be something like a custom-made database. And so we might end up with an "integration database" but call it something different.

Great thanks.

So we have a DB and a service in front of it. The DB gets a new schema, but the service maps to the previous representation in such a way that clients do not need to know about the new schema. Is that the gist?

Then when it comes to using Fauna, what is it that does not allow such a data flow?

Interesting read, but seems like a tangential topic?