by sago 4085 days ago
You're making me feel very dumb. Because several of these seem to be the opposite of what I've observed.

Testing with a mock object implies that the mock object can generate all the required output that the real object can generate that might have some effect on the consuming code. Not only that, but it assumes that the mock object generates the correct data in ways that cannot generate false positives in the test. This doesn't mean you're only testing the client logic. You're now testing the client logic using services that are ad-hoc and aren't guaranteed to behave like the real thing. You're testing a fantasy.
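To make the false-positive worry concrete, here is a minimal sketch in Python using the standard `unittest.mock`. `RealStore`, `display_name`, and the row shape are hypothetical names invented for this illustration. The only point is that a hand-written mock can bake in an assumption the real service doesn't honor.

```python
from unittest.mock import Mock


class RealStore:
    """Hypothetical stand-in for the real service: unknown ids yield None."""

    def __init__(self, rows):
        self._rows = rows

    def find_user(self, user_id):
        return self._rows.get(user_id)


def display_name(store, user_id):
    # Client code under test: it silently assumes the user always exists.
    return store.find_user(user_id)["name"].title()


# The mock only ever returns a well-formed row, so this "test" is green.
mock_store = Mock()
mock_store.find_user.return_value = {"name": "ada lovelace"}
mocked_result = display_name(mock_store, 999)

# The same call against the real behaviour blows up on the missing user.
real_store = RealStore({1: {"name": "ada lovelace"}})
try:
    display_name(real_store, 999)
    real_failed = False
except TypeError:
    real_failed = True
```

The mocked run passes while the realistic one fails, which is exactly the gap in fidelity being described.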

It is far better to test against the real database, using a fixture, or a transaction, or some way to use the actual system with representative data. Mocks have their place in very complex services where this is practically impossible. But they don't suddenly make things better for testing, or more atomic. IMHO, when you have to use a mock, it should be as a last resort, when you have to sacrifice fidelity for tractability. Your code is coupled in behavior to the services it uses; pretending it isn't is just fooling yourself.

I have very much the same problem with people who write unit tests against, say, SQLite databases, rather than the full DBMS. The complexity of 'making sure the database is connected and can be queried' is pretty trivial compared to the complexity of mocking a whole RDBMS interface. Good software engineering will, of course, limit the number of places the database interfaces with (I'm not suggesting code with SQL statements in strings everywhere, that's a straw man). But I'd not accept mocked tests that exist just to avoid a database connection or because the developer doesn't understand how to write a transaction.

So I don't understand. Either you're advocating a very bizarre, and seemingly pathological development style, or you're consistently muddying the waters by comparing good programming in your chosen methodology with bad programming in mine, which just misses the point.

Here's an example then. In your Person object, on a platform with reasonable transaction/fixtures support (like Django). Is it better to write your unit test using a mocked ORM layer, or a fixture with the test data in it?

> He is missing perhaps the main benefit of unit tests - which is that, when a bug arises, you can quickly eliminate many possible causes because unit tests against those parts of code have succeeded

I've no idea why this is somehow impossible. I write unit tests at various levels of abstraction. If I have module A, calling module B which calls module C, then I need tests for C, B(+C) and A(+B+C). If I get a failure in A, I make sure that there is a test in B that corresponds to the way A is using B; if there is and it passes, it is a problem with A, not B. If B and C were mocked, I'd have no way of knowing if the problem was with the mock logic without having to test C, C-mock, B+C-mock, B-mock, A+B-mock.
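As a toy sketch of that layering in Python (the three functions `c_parse`, `b_total`, and `a_report` are hypothetical stand-ins for modules C, B, and A, not anything from a real codebase):

```python
def c_parse(text):
    # Module C: lowest level, turns "3,4" into [3, 4].
    return [int(part) for part in text.split(",")]


def b_total(text):
    # Module B: uses the real C.
    return sum(c_parse(text))


def a_report(text):
    # Module A: uses the real B (and through it the real C).
    return f"total={b_total(text)}"


def test_c():
    assert c_parse("3,4") == [3, 4]


def test_b_with_c():
    # Covers B the way A uses it: a comma-separated string.
    assert b_total("3,4") == 7


def test_a_with_b_and_c():
    assert a_report("3,4") == "total=7"


for test in (test_c, test_b_with_c, test_a_with_b_and_c):
    test()
```

If `test_a_with_b_and_c` fails while the two lower tests pass, the fault is in A, and no mock logic needs to be ruled out first.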

> now has a dependency on all of the code which mutates it

This seems a bizarre claim. Does your code have a dependency on everything else that can possibly change what's on the screen? If so, how do you deal with that?

That's why pretending 'global variables' = 'all central resources' seems foolish to me.

1 comments

I probably have quite a fundamentalist view on unit testing because I write primarily in purely functional code these days - where a "unit" is a pure function, and it's clearly an isolated unit. Even when I'm back in OOP world though, I basically avoid static variables/globals like the plague. Even where the framework or some library makes use of them, I'll tend to wrap them up and pass them into my code via Main, to make sure that no statics are globally accessible throughout the code.
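For what it's worth, a small Python sketch of the "wrap it and pass it in via Main" habit. `Settings`, `format_salary`, and the `APP_CURRENCY` variable are hypothetical names for this illustration; `os.environ` plays the role of the global that a framework might expose.

```python
import os


class Settings:
    """Hypothetical wrapper around a process-wide global (os.environ)."""

    def __init__(self, environ):
        self._environ = environ

    def currency(self):
        return self._environ.get("APP_CURRENCY", "USD")


def format_salary(amount, settings):
    # Depends only on its arguments, never on os.environ directly.
    return f"{amount:.2f} {settings.currency()}"


def main():
    # The only place that touches the global; everything else receives it.
    settings = Settings(os.environ)
    print(format_salary(1234.5, settings))


if __name__ == "__main__":
    main()
```

In a test, `Settings` can be built from a plain dict, so nothing outside `main` ever sees the real environment.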

If I were testing a salary calculation which takes values from a database, and I named my test "Test_salary_calculation_correct", where instead of using some sample data which could easily cover the range of values I need to test against, I instead relied on a database connection, and this test failed because the database was not accessible - I've only confused the developer who picks up my shit where "Test_salary_calculation_correct" fails, and he thinks there's a problem with my calculation rather than a misconfigured firewall somewhere else. The firewall has nothing to do with my salary calculation - why should it have any effect on the test passing?
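Roughly what I mean, as a Python sketch. The salary rule in `monthly_salary` is a made-up assumption purely for illustration; what matters is that the test feeds it sample values directly and can only fail if the calculation is wrong.

```python
def monthly_salary(annual_salary, bonus_percent):
    # Hypothetical rule: annual pay plus a percentage bonus, split over 12 months.
    if annual_salary < 0 or bonus_percent < 0:
        raise ValueError("salary and bonus must be non-negative")
    return round(annual_salary * (100 + bonus_percent) / 1200, 2)


def test_salary_calculation_correct():
    # Sample data is written inline; no database or network is involved.
    assert monthly_salary(0, 10) == 0
    assert monthly_salary(60000, 0) == 5000
    assert monthly_salary(60000, 10) == 5500
    try:
        monthly_salary(-1, 0)
    except ValueError:
        pass
    else:
        raise AssertionError("negative salary should be rejected")


test_salary_calculation_correct()
```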

The way I see unit tests is this: If you write a test and it passes on your machine, then some other developer takes your code and the same test fails - it's a fuckup on your behalf. Unit tests should not depend on the environment in any way. Actually, by definition, a unit test is a test of a single "unit" - including database access in this goes well beyond the scope of unit testing and into integration testing.

To me it seems you're skipping unit testing and just going onto integration testing with your unit testing framework. I'm not sure what you've observed or where, but I can tell you it's certainly not standard or best practice in the industry. It might possibly tell you something about your own code style though - are you writing units which can be treated in isolation? (Certainly not if you depend on a SL, which is a global context of services with no clear boundary)

Ideally a codebase should be designed to maximize unit-testability and reduce the need for integration testing to as little as possible - since this is where most of the "unexpected", or "out of my control" problems are most likely to occur. This testing is more a case of "am I handling all the relevant exceptions" than getting green lights to pass in a unit testing framework. It doesn't really help to make unit tests against code which is expected to fail out in the wild due to whatever circumstance - what matters here is that your code is prepared for the worst and knows how to recover.

It's these cases where mock classes are particularly useful - because you can forcefully simulate any behavior from the external service and make sure your code is working correctly for all the potential circumstances. Having to rely on divine intervention to trigger some event that may only happen 1% of the time in the real-world situation is hardly practical. Unfortunately testing in the wild is often like this - everything works fine 99% of the time.
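A minimal Python sketch of forcing the rare case with the standard `unittest.mock`. The client interface (`get_rate`) and `fetch_rate_with_fallback` are hypothetical, invented for this illustration.

```python
from unittest.mock import Mock


def fetch_rate_with_fallback(client, default_rate):
    # Code under test: must survive the external service timing out.
    try:
        return client.get_rate()
    except TimeoutError:
        return default_rate


# Force the rare failure on demand instead of waiting for it to happen.
flaky_client = Mock()
flaky_client.get_rate.side_effect = TimeoutError("service unavailable")
fallback = fetch_rate_with_fallback(flaky_client, default_rate=1.0)

# The ordinary path, for comparison.
healthy_client = Mock()
healthy_client.get_rate.return_value = 1.25
normal = fetch_rate_with_fallback(healthy_client, default_rate=1.0)
```

Setting `side_effect` to an exception makes the mock raise it on every call, so the recovery path runs on every test run rather than 1% of the time.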

Even for cases where you're arguing for a fixture with real test data in (from a database), then the reasonable thing to do is extract this data beforehand and encode it into the unit testing language (which is fairly trivial to do). Now you have a reliable test which will continue to work as you update the code. Testing against live data is giving a false sense of security to begin with anyway. Imagine the scenario where you have a bunch of data in the database, you run your unit test against it with all green flags - then after deployment, somebody inserts into the database a value which your code doesn't expect. The unit test shouldn't be testing against real world data, but against data representative of the possible values it should accept (i.e., include all the obvious edge cases which should fail too, but are not likely to exist in the real world DB).
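Something like the following Python sketch, where the validation rule in `is_valid_salary_row` and all the rows are invented for illustration. The extracted rows are frozen into the test as literals, and the edge cases are ones a live database probably wouldn't happen to contain.

```python
def is_valid_salary_row(row):
    # Hypothetical validation rule, made up for this example.
    name, salary = row
    return bool(name and name.strip()) and isinstance(salary, int) and salary >= 0


# Rows copied out of the database once, then encoded in the test itself.
EXTRACTED_ROWS = [("Ann", 52000), ("Bob", 61000)]

# Edge cases the live data is unlikely to contain but the code must handle.
EDGE_CASES = [
    (("", 50000), False),
    (("   ", 50000), False),
    (("Cy", -1), False),
    (("Di", 0), True),
    (("Ed", None), False),
]


def test_salary_rows():
    for row in EXTRACTED_ROWS:
        assert is_valid_salary_row(row)
    for row, expected in EDGE_CASES:
        assert is_valid_salary_row(row) is expected


test_salary_rows()
```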