Hacker News new | ask | show | jobs
by majodev 2172 days ago
We are using PostgreSQL templates to speed up our integration tests. This allow us to have full blown (and non mocked) databases which are isolated to each parallel running test.

We have recently open-sourced our core-server managing this (go): https://github.com/allaboutapps/integresql

Projects already utilizing Go and PostgreSQL may easily adapt this concept for their testing via https://github.com/allaboutapps/integresql-client-go

4 comments

We have found H2 in Postgres mode to be the fastest way to test db stuff, at least in Postgres and Java. In Go you might lose most of the speed advantage since its not running "in JVM". It also doesn't support many of pg's advanced features
I'm generally not a fan of emulating database behavior through a mocking layer while testing/implementing. Even minor version changes of PostgreSQL plus it's extensions (e.g. PostGIS) may introduce slight differences, e.g. how indices are used, function deprecations, query planner, etc. . It might not even be an erroneous result, just performance regressions or slight sorting differences in the returned query result.

We try to approximate local/test and live as close as possible, therefore using the same database, with the same extensions in their exact same version is a hard requirement for us while implementing/testing locally.

However, I have no experience with H2 (and the modern Java ecosystem in general), so cannot talk about how close their emulation implementation resembles PostgreSQL behavior.

We do the same thing, with the addendum that nearly all of our tests use this mechanism.

A full test suite run recreates the template db from flyway migrations; every test run clones the template db. One nice thing about this is that it's easy to examine the state of the database on test failures. One downside is that there's no "drop multiple databases" command so cleanup can take a while depending on how often you run the script.

Yep, inspecting the actual database state after multiple test failures directly is very nice (we log the used test-database after before every test, so it's easy to connect afterwards).

Regarding cleanup: We run configure a max pool size in integresql after which "dirty"-flagged test databases are automatically removed (and then recreated on demand). We may also support auto-deletion (e.g. after a successful test) in the future, however currently it just does not seem necessary even when we have 5000+ test-db, it's just a disk-space concern.

This is super nice! I have seen similar solutions for Python in the past, but having an easy API + Docker makes this generic and language agnostics.

Have you looked to tweak PSQL parameters to make it faster for test runs e.g. by disabling disk writes?

Yep, disabling fsync, synchronous_commit and full_page_writes ONLY for this PostgreSQL integration testing strategy gives us an additional boost of ~35% in overall testing speed. https://github.com/allaboutapps/go-starter/blob/master/docke...

Regarding being language agnostic: YES, clients for languages other than go are very welcome. The integresql server actually provides a trivial RESTful JSON api and should be fairly easy to integrate with any language. This is our second iteration regarding this PostgreSQL test strategy, the last version came embedded with our legacy Node.js backend stack and wasn't able to handle tests in parallel, which became an additional hard requirement for us.

See our go-starter project for infos on how we are using integresql, e.g. our WithTestDatabase/WithTestServer utility function we use in all tests to "inject" the specific database: https://github.com/allaboutapps/go-starter/blob/master/inter...

How does this compare (performance-wise) to simply truncating all tables?
We did several benchmarks back in the day and it definitely depends on the size of your migrations + fixtures that you need in every test. Performance increases were simply huge for us, despite the serial execution of our tests in our legacy backends.

A new database scaffolded from a predefined template becomes available in several milliseconds or few seconds VS. truncating, migrating up and seeding (fixtures) would always take several seconds. We initially experimented with recreating these test-databases directly via sql dumps, then binary dumps until we finally realized that PostgreSQL templates are the best option in these testing scenarios.

Sorry, I've got no benchmarks to share at this time.