Hacker News new | ask | show | jobs
by sullenpaladin 3338 days ago
Abstracting the database has another benefit you are neglecting. It is easy to mock the database and test your application use cases if the database layer is abstracted. This is nearly impossible if database access is baked into the app.
3 comments

But do you really need to mock the database when you can just deploy a containerized one for testing?
We used a real database for testing. In the short term it was less work than architecting the application in a way that allowed the DB to be mocked.

Long term, the result was a 45-minute test suite which spent most of its time setting up records in the DB, and which could not be parallelized except by adding more instances of the real DB (otherwise the tests would interfere with one-another's assertions about the DB state).

but now with containers you can have a snapshot that's already set up in the right state. A separate build.
This is a powerful technique, though I'd say it's more useful for the build pipeline than for developers running UTs while coding (since the schema can change more frequently while doing so).

To provide a counterpoint though, an advantage touted by advocates of complete decoupling from the DB is that your UT suite can run in O(one minute), rather than O(ten minutes). E.g. see https://www.youtube.com/watch?v=tg5RFeSfBM4.

I'm not 100% sold on this approach yet (separating your domain objects entirely from the ORM wrapper is an uphill struggle), but it's interesting.

We run our tests on databases and it's on the order of one minute. Maybe it's a hardware issue? We all have beefy computers with fast SSDs.

Also since we strive to make our databases upgradable, it's important that the actual schema update scripts themselves are tested and used directly.

Design from the data first. You can build testing into your change management at the database layer. In my experience building applications this way reduces the chance for the database to enter into an invalid state.

When I build web applications on top of this they have less to do. They literally parse HTTP and shuttle data. No big MVC framework needed. When the data model changes we change the data model. In the database. And we use the abstraction facilities in our server to keep the public schema clean.

Premature abstraction is just as dangerous as premature optimization. Maybe even worse in my experience.

Another benefit is getting database stuff out of your domain code, makes solving the business problem a lot cleaner.

In fact i just write a domain model, solve the problem, then write database adapter to persist the domain model. I always think about modeling the domain first, then persistence is an after thought. The persistence adapter can be done however you want, raw sql orm, nosql. So it's not really "abstracted away", it's just no the focus of the application.

Everything just plugs into the domain model.

Your data is your domain. If you don't understand your data then you don't understand the problem. If there's a business rule for how that data is handled then it's most consistent to define it in the server that is purpose built to manage your data. Especially when it has built in facilities to constrain possible states.

I've seen too many programmer bugs to trust putting business logic outside of the DB. Separation of concerns here too just on different lines of concern.

> If you don't understand your data then you don't understand the problem.

That's a rather sweeping statement, that many in the industry do not agree with.

https://en.wikipedia.org/wiki/Domain-driven_design https://martinfowler.com/

> If there's a business rule for how that data is handled then it's most consistent to define it in the [db]

To be clear, you're proposing that all business rule validation should be implemented as stored procedures in the DB? One of my domain model aggregates consists of thousands of lines of code just enforcing business rules and constraints. Am I to put that all in the DB?

> That's a rather sweeping statement, that many in the industry do not agree with.

And some who do

https://dataorientedprogramming.wordpress.com/tag/mike-acton...

I'm not going to appeal to authority here. I'm speaking from experience. We all know how source code gets over time with hundreds of programmers working on it. A clear, consistent specification of your data model, rules, invariants, and transformations is far more valuable than the abstract-soup of trying to model your business domain in source code. I think we can all agree that the less code there is to understand then the easier it is to verify it is correct.

Verifying the requirement that "when record A is written to the database then B is appended with the delta change if such and such is True" is guaranteed at the database level along with all of the other constraints on those relations. If it's nested in one of these rings behind an abstract factory somewhere it's harder to verify.

> Am I to put that all in the DB?

That's where I would start. But I'm not you and I don't understand the problem you're trying to solve.

My original point was that abstracting out the platform if you're not really concerned about switching platforms is a form of premature pessimization and a source of errors. If you control the platform, target the platform and don't bother with the abstractions.

A domain can become more about behaviour, intergration and business proccess. Rather than pure data.

I find most SQL languages are not particularly great for general purpose programming. Especially tooling for testing.

I find relational algebra most useful for anything involving querying, validating, aggregating, and transforming data. SQL99 is by far one of the more useful implementations of it that I know about. And yes it does have limitations.

That's why most mature RDBMS servers ship with at least one procedural language. Though it'd be nice if there was an option to use OCaml or Haskell in PostgreSQL.

I'm not suggesting to throw out all your code and build your entire application in SQL. I'm just saying that if you control the database, use it, exploit it and don't abstract it out unless you absolutely have to (because you need to ship your application on-premises to clients who may run MySQL servers and others who run Oracle.

It's depends on your domain. However most domains are just as much about behaviour and data then just data.

Bugs are just as likely to occur in the database programming language as they are in application logic.

In fact i think tooling around unit testing is far more mature in general purpose languages.

Agreed, when's the last time you saw a debugger for a stored procedure? There aren't any. If you want to debug such a thing, you've got to write a bunch of print statements like it was 1982
I've seen unbelievable messes developed as sprawling stored procedures. And, no unit tests, since no one ever wants to test database logic, they just assume it's gold.
The same is true for the sprawling mess of Java interfaces and Factory patterns. Except now you have two sources of truth. Having worked with regulated clients and having to audit the entire stack it's much easier to to do with less code than more.

I've seen databases like that and similar teams that didn't take care with their change management.

Either way it's never pleasant to work with such systems.