by nrr 534 days ago
I think it's probably worth mentioning that the principal concern for tests should be proving out the application's logic, and unless you're really leaning on your database to be, e.g., a source of type and invariant enforcement for your data, any sort of database-specific testing can be deferred to integration and UAT.

I use both the mocked and real database approaches illustrated here because they ultimately focus on different things: the mocked approach validates that the model is internally consistent with itself, and the real database approach validates that the same model is externally consistent with the real world.

It may seem like a duplication of effort to do that, but tests are where you really should Write Everything Twice in a world where it's expected that you Don't Repeat Yourself.


The database is often the thing that enforces the most critical application invariants, and is the primary source of errors when those invariants are violated. For example, "tenant IDs are unique" or "updates to the foobars are strictly serializable". The only thing enforcing these invariants in production is the interplay between your database schema and the queries you execute against it. So unless you exercise these invariants and the error cases against the actual database (or a lightweight containerized version thereof) in your test suite, it's your users who are actually testing the critical invariants.
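To make the "tenant IDs are unique" case concrete, here is a minimal sketch of a test that exercises the constraint and its error path against a real engine. It assumes Python and uses an in-memory SQLite database only so the example runs anywhere; in practice you would point it at the same engine you run in production. The `tenants` schema and the `create_tenant` helper are hypothetical.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    # In-memory SQLite stands in for the real database in this sketch.
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: the PRIMARY KEY is what enforces uniqueness.
    conn.execute(
        "CREATE TABLE tenants (tenant_id TEXT PRIMARY KEY, name TEXT NOT NULL)"
    )
    return conn


def create_tenant(conn: sqlite3.Connection, tenant_id: str, name: str) -> bool:
    """Insert a tenant; return False if the database rejects a duplicate ID."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO tenants (tenant_id, name) VALUES (?, ?)",
                (tenant_id, name),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def test_duplicate_tenant_is_rejected() -> None:
    conn = make_db()
    assert create_tenant(conn, "t-1", "Acme") is True
    # The second insert must hit the constraint, not silently overwrite.
    assert create_tenant(conn, "t-1", "Other") is False
    count = conn.execute("SELECT COUNT(*) FROM tenants").fetchone()[0]
    assert count == 1
```

The point of the sketch is that the failure comes from the schema itself, so a mock that never raises would not catch a missing constraint.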

I'm pretty sure "don't repeat yourself" thinking has led to the vast majority of the bad ideas I've seen so far in my career. It's a truly crippling brainworm, and I wish computer schools wouldn't teach it.

> The only thing enforcing these invariants in production is the interplay between your database schema and the queries you execute against it.

I'm unsure that I agree. The two examples you gave, establishing that IDs are unique and that updates to entities in the system are serializable (and linearizable while we're here), are plenty doable without having to touch the real database. (In fact, as far as the former is concerned, this dual approach to testing is what made me adopt having a wholly separate "service"[0] in my applications for doling out IDs to things. I used to work in a big Kafka shop that you've almost certainly heard of, and they taught me how to deal with the latter.)

That said, I'd never advocate for just relying on one approach over the other. Do both. Absolutely do both.

> I'm pretty sure "don't repeat yourself" thinking has led to the vast majority of the bad ideas I've seen so far in my career. It's a truly crippling brainworm, and I wish computer schools wouldn't teach it.

I brought up WET mostly to comment that, if there's one place in software development where copying and pasting is to be encouraged, testing is it. I'd like to shelve the WET vs. DRY debate as firmly out of scope for this thread if that's alright.

0: It's a service inasmuch as an instance of a class implementing an interface can be a service, but it opens up the possibility of more easily refactoring to cross over into running against multiple databases later.
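A rough sketch of what such an in-process ID "service" could look like, assuming Python. None of this is the commenter's actual code: `IdService`, `SequentialIdService`, `UuidIdService`, and `register_tenants` are hypothetical names chosen for illustration.

```python
import itertools
import uuid
from typing import Dict, Iterable, Protocol


class IdService(Protocol):
    """Anything that can hand out a fresh ID."""

    def next_id(self) -> str: ...


class SequentialIdService:
    """Deterministic IDs, convenient for unit tests without a database."""

    def __init__(self, prefix: str = "id") -> None:
        self._prefix = prefix
        self._counter = itertools.count(1)

    def next_id(self) -> str:
        return f"{self._prefix}-{next(self._counter)}"


class UuidIdService:
    """Random IDs that need no coordination with any single database."""

    def next_id(self) -> str:
        return str(uuid.uuid4())


def register_tenants(names: Iterable[str], ids: IdService) -> Dict[str, str]:
    # Application logic depends only on the interface, not on where IDs come from.
    return {ids.next_id(): name for name in names}
```

In a unit test you might pass `SequentialIdService("tenant")` and assert on exact IDs, then swap in a database-backed or UUID-backed implementation elsewhere without touching `register_tenants`.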

I've often been tempted to make an "id service" also because you can potentially get compact integer ids that are globally unique. That'll likely save you more than a factor of 2 in your ID fields given varint encoding, which could be very significant in overall throughput depending on what your data look like. Never actually tried it IRL though.
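For a feel of the sizes involved, here is a small sketch of a base-128 varint encoder (the little-endian, 7-bits-per-byte scheme used by formats such as Protocol Buffers). The comment doesn't name a baseline, so comparing against fixed 8-byte integers and 16-byte UUIDs is an assumption made here for illustration.

```python
import uuid


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint, low bits first."""
    if n < 0:
        raise ValueError("varint encoding here covers non-negative integers only")
    out = bytearray()
    while True:
        byte = n & 0x7F  # take the low 7 bits
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow: set the continuation bit
        else:
            out.append(byte)
            return bytes(out)


FIXED_INT64_BYTES = 8  # assumed baseline: a fixed-width 64-bit ID
UUID_BYTES = len(uuid.uuid4().bytes)  # 16 raw bytes

small_id = encode_varint(1_000_000)  # a compact, centrally allocated ID
```

Under those assumptions, the millionth ID takes 3 bytes as a varint, versus 8 for a fixed 64-bit integer and 16 for a raw UUID. The savings shrink as IDs grow, since each extra 7 bits costs another byte.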

I agree both approaches are important, and it's totally ok if they overlap. If your unit tests have some overlap with your integration tests, that's nbd especially seeing as you can run your unit tests in parallel.

EDIT: actually I'll make a much bolder claim: even if your unit tests are making flawed assumptions about the underlying dependencies, it's still pretty much fine so long as you also exercise those dependencies in integration tests. That is, even somewhat bit-rotted unit tests with flawed mocks and assertions are still valuable because they exercise the code. More shots on goal is a great thing even if they're not 100% reliable.

> If your unit tests have some overlap with your integration tests, that's nbd especially seeing as you can run your unit tests in parallel.

Exactly.

Another upside I've run into while doing things this way is that it gets me out of being relational database-brained. Sometimes, you really do not need the full-blown relational data model when a big blob of JSON will work just fine.

There's something very, very wrong in the way we write programs nowadays.

Because yeah, the database is your main source of invariants. But there is no good reason for your application environment not to query the invariants from there and test or prove your code around them.

We do DRY very badly, and the most vocal proponents are the worst... But I don't think this is a good example of the principle failing.

> There's something very, very wrong in the way we write programs nowadays.

I largely agree, but...

> ... the database is your main source of invariants.

I guess my upbringing through strict typing discipline leaves me questioning this in particular. I'm able to encode these things in my types without consulting my database at build time and statically verify that my data are as they should be as they traverse my system with not really any extra ceremony.
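One way to picture this, as a loose sketch in Python rather than a strictly typed language: wrap the raw value in a type that can only be constructed when the invariant holds, so everything downstream takes the wrapper instead of a bare string. The static half of the guarantee assumes a type checker such as mypy is run over the code; `TenantId`, its validation rule, and `load_tenant_name` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class TenantId:
    """A tenant ID that is known to be well-formed once it exists."""

    value: str

    def __post_init__(self) -> None:
        # Hypothetical rule: non-empty and free of whitespace.
        if not self.value or any(ch.isspace() for ch in self.value):
            raise ValueError(f"invalid tenant id: {self.value!r}")


def load_tenant_name(tenant_id: TenantId, table: Dict[str, str]) -> str:
    # Accepting TenantId (not str) means callers must have validated already;
    # a type checker can flag call sites that pass a raw string.
    return table[tenant_id.value]
```

Malformed input is rejected at the application boundary, before any round-trip to the database, while the database constraint remains as the final backstop.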

Encoding that in the database is nice (and necessary), but in the interest of limiting network round-trips (particularly in our cloud-oriented world), I really would prefer that my app can get its act together first before crossing the machine boundary.

> no good reason for your application environment not to query the invariants from there and test or prove your code around them

As a developer who primarily builds backend web applications in high-level languages like Go and Java, I run the risk of sounding ignorant talking like this, but... I'm led to believe lower-level systems and embedded software have a lot more invariant-preserving runtime asserts and such in them. The idea being that if an invariant is violated, it's better to fail hard and fast than to attempt to proceed as if everything is alright.

Hum... I'm not sure we are talking about the same thing. Of course system and embedded software won't have invariants stored in a database, the comment isn't about them.

But there isn't a faster way to fail on an invariant than to prove statically that your code violates it, or to test it before deploying. I don't really understand your criticism.