| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by Everlag 2171 days ago

I am curious how many people mock their SQL queries and how they decide to mock vs not mock at that level.

From the perspective of the services I'm knee deep in at the moment, SQL feels like the wrong layer to mock at. Instead, when we do want to mock, we usually pass around mocked out data access objects. That way we work at the application layer of `GetN` rather than `SELECT a, b, c FROM N`.

It feels like a lot of the value of the tests we write that do execute sql is a result of them actually running it. Granted, it is slower than mocking, requires synchronous rather than parallel tests, and we need to wipe the impacted tables for every test. However, the value we get is that we know that a query actually has the intended semantic.

6 comments

yashap 2171 days ago

In unit tests, I like to only test a single method in isolation, with all of it’s dependencies mocked. So in a domain layer method, that depends on some infrastructure layer code that itself does the DB access, I mock the DB access layer. And the DB access layer already hides the SQL, so you’re mocking “normal” objects/methods, not SQL.

However, when unit testing the DB access code itself, I like to connect to a real-but-embedded database (for example with JVM projects, these guys have some great embedded DB libs: https://github.com/flapdoodle-oss), with a different DB per test class (different test classes run concurrently), but no cleanup between test cases (which run sequentially). This lets me test the SQL queries themselves.

Integration tests are a different story, they run against actual deployed versions of services, with real everything (including databases), nothing is mocked.

link

jeltz 2171 days ago

Typically there is no issue with running parallel tests against a real database. I just wrap every test case in a transaction and turn the transactions into subtransactions. This way you can just rollback at the end of each test.

link

ramraj07 2171 days ago

We typically have multiple options in our tests - some SQL logic is testable in sqlite itself, while others (crap like arrays and datediffs) we try to run on an actual postgres instance. Snowflake has been the most annoying, since it doesn't support transactions and is also relatively slow anyways.

link

controversy 2171 days ago

I agree with you. SQL needs to be integration tests. In this era of Docker spinning up a DB should be easy.

link

xyzzy_plugh 2171 days ago

Yep, this is the pattern I usually use. It's useful to put tests that hit a real DB behind a build flag, for situations where a DB might not be present. Also since I prefer to bake migrations into my application, those get thoroughly tested as well.

link

sverhagen 2171 days ago

We require that CI builds are self-contained. So if you need a database, run it. Others mentioned Docker. Specifically there is Testcontainers. If you need cross-service integration, those don't belong in your CI builds. (So the tech/setup can also be a bit different.)

link

rmetzler 2171 days ago

I may be naive, but do people with slow DB tests put the DB in a tmpfs to improve performance?

link

tarjei 2171 days ago

We do this all the time. Our tests run against SQLite (I use Mariadb in prod).

Each parallel testrun gets its own path in /dev/shm/xx so everything is in ram as well.

link

ptx 2171 days ago

If you're using SQLite, why the indirection of an in-memory filesystem rather than SQLite's in-memory database feature?

link

yencabulator 2171 days ago

Not the person you're asking this, but N connections to sqlite :memory: backend results in N separate database instances.

This can be worked around in some scenarios, but not always & reliably. Most importantly, you can't reach the same in-memory database from multiple processes.

https://www.sqlite.org/inmemorydb.html

link

mixedCase 2171 days ago

I have used libeatmydata to help with this, but you just made me realize that in my current project I could just dump the test db to RAM. Thanks for the tip!

link

sverhagen 2171 days ago

For integration testing I'm sure people do all sorts of "interesting" things. For unit tests in an automated CI build, it is often more desirable to have a smaller subset of data, which presumably is less slow, since tests there also focus on more isolated workflows.

link