Hacker News
by tfussell 1884 days ago
What if your test includes a transaction already?
5 comments

We make tests robust in the face of existing data. For example, each name is "GUID-ified", and we only compare specific subsets of rows that we know will be isolated from the rest.

We still use transaction rollback for most of our tests, but all tests are written with the assumption of existing data, to accommodate the few tests that must commit their transactions (such as concurrency tests).

This is the way to go imo!

It also enables parallelization of test runs :)

And it also makes testing databases with limited transaction APIs (like DynamoDB) much easier.

One approach we used was to wrap the connections in a proxy (easy in Python), and flag if the code called the commit() method at any point. You do need to trust that the code doesn't call COMMIT explicitly via the execute() method, and flag tests running subprocesses that might dirty the database. If the test is dirty, rebuild the test database from the template instead of relying on rolling back the transactions.
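Roughly, the proxy idea could be sketched like this, assuming sqlite3 from the standard library. `CommitTrackingConnection` and `run_test` are hypothetical names, and the rebuild-from-template step is left to the caller:

```python
import sqlite3


class CommitTrackingConnection:
    """Hypothetical proxy that records whether commit() was ever called."""

    def __init__(self, real):
        self._real = real
        self.dirty = False

    def commit(self):
        self.dirty = True
        return self._real.commit()

    def __getattr__(self, name):
        # Everything except commit() is delegated to the wrapped connection.
        return getattr(self._real, name)


def run_test(test_fn, conn):
    """Run test_fn; return True if the database needs rebuilding afterwards."""
    proxy = CommitTrackingConnection(conn)
    try:
        test_fn(proxy)
    finally:
        conn.rollback()  # enough for tests that never committed
    return proxy.dirty


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.commit()


def clean_test(c):
    c.execute("INSERT INTO t VALUES (1)")


def dirty_test(c):
    c.execute("INSERT INTO t VALUES (2)")
    c.commit()


clean_needs_rebuild = run_test(clean_test, conn)
dirty_needs_rebuild = run_test(dirty_test, conn)
```

As noted above, this does not catch a COMMIT sent through execute() or writes made by subprocesses, so those cases would still need to be flagged some other way.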

However, if you add this behavior to an existing test suite you will likely have to fix a lot of tests that assume sequences will return predictable numbers, such as in automatically generated primary keys. Technically, these tests are already broken, even if they are unlikely to fail.

Typically the approach I take is to monkey-patch out the transaction methods (begin, commit, rollback) in the test harness and wrap each test in a real transaction. Some test runners have this built in.

This is easy in dynamic languages, really hard in static languages.

So say you're testing, against a database, some process that you expect to fail, and you would like to test whether the rollback is done properly.

And because you abstracted all your transaction logic away for the tests, your test result is not worth anything.

IMHO, there are only 2 ways to achieve proper database testing: 1) a new db for each test (very slow), or delete all data from all tables before the test (ok for smaller projects); 2) tests have absolutely no impact on other tests with the data they added/modified/dropped (very hard to achieve).

Edit: typo

I actually usually do something compatible with that for rollbacks, or I special case those tests.

It doesn't have to work in all corner cases, just the ones that matter to me.

The advantage of doing it this way is the tests run very fast - which is critical if you want developers to use them often.

Couldn't you just patch the transaction logic to savepoints? So even tests building on transactional behaviour would work correctly.
I seem to recall savepoints differ in some ways that can make this difficult - but without remembering the specific problem I hesitate to say anything concrete about that.
You could use savepoints instead to wrap the transaction.
Some ORMs do that automatically as they detect nesting transaction attempts.
Not sure exactly how Django handles it, but it works really well and transparently. The only issue is when the code you are testing rolls back a nested transaction; I seem to remember that caused issues, but there is a workaround at least.
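For what it's worth, here is a rough sketch of the savepoint idea using plain sqlite3. It is not how Django or any particular ORM implements it. The `savepoint` context manager is a hypothetical stand-in for the begin/commit/rollback the code under test would normally issue:

```python
import itertools
import sqlite3
from contextlib import contextmanager

_counter = itertools.count()


@contextmanager
def savepoint(conn):
    """Hypothetical nested 'transaction' implemented as a savepoint."""
    name = f"sp_{next(_counter)}"
    conn.execute(f"SAVEPOINT {name}")
    try:
        yield
    except Exception:
        # The inner "rollback" only undoes work done since the savepoint.
        conn.execute(f"ROLLBACK TO SAVEPOINT {name}")
        conn.execute(f"RELEASE SAVEPOINT {name}")
        raise
    else:
        # The inner "commit" just releases the savepoint.
        conn.execute(f"RELEASE SAVEPOINT {name}")


conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t (x INTEGER)")

conn.execute("BEGIN")  # outer transaction owned by the test harness
with savepoint(conn):  # commit path of the code under test
    conn.execute("INSERT INTO t VALUES (1)")
try:
    with savepoint(conn):  # rollback path of the code under test
        conn.execute("INSERT INTO t VALUES (2)")
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

inside = [row[0] for row in conn.execute("SELECT x FROM t ORDER BY x")]
conn.execute("ROLLBACK")  # harness discards everything at the end
after = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
```

Here the inner rollback really happens, so the test can observe it, while the outer rollback still leaves the database clean.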