
by mwilliamson 1950 days ago

(I'm the author of precisely)

If you were to ask me "Should I use precisely in my tests?", my answer would be: it depends. The main benefit is that the assertion states the intent of your test directly, so that it is neither under- nor over-specified.

Assuming your assertion is meant to be an alternative to the example in the README:

    assert_that(result, contains_exactly("a", "b"))

I'd suggest that the above states the intention of the assertion, rather than how you check it. For instance, your assertion would allow duplicate elements, whereas the assertion as originally written would suggest that this isn't desired. As other comments have pointed out, you can do things with sorted, Counter or set (depending on exactly what you want to assert), but why worry about what trick to use when you could just directly state your intention?
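To make the difference between those tricks concrete, here is a minimal sketch using only the standard library. The list values are made up for illustration; the point is only which properties each comparison does and doesn't check.

```python
from collections import Counter

expected = ["a", "b"]
reordered = ["b", "a"]            # same elements, different order
with_duplicate = ["b", "a", "a"]  # one unwanted extra element

# set: ignores order AND duplicates, so the extra "a" goes unnoticed.
assert set(with_duplicate) == set(expected)

# sorted: ignores order but still counts duplicates (elements must be orderable).
assert sorted(reordered) == sorted(expected)
assert sorted(with_duplicate) != sorted(expected)

# Counter: ignores order but still counts duplicates (elements must be hashable).
assert Counter(reordered) == Counter(expected)
assert Counter(with_duplicate) != Counter(expected)
```

Each of these encodes a slightly different claim about the result, and the reader has to know the trick to work out which claim was meant.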

The assertion using precisely also (arguably) makes it easier for a reader to see what is (and isn't) being asserted in the test, and it makes the test less brittle, since you're not accidentally asserting more than intended (for instance, it's common to assert equality with a list even though you don't actually care about order).

Another common case is when you want to make assertions on a collection, but equality would check too much. For instance, suppose you have a function that fetches users from a database. The fetch can return the users in any order, and you just want to check the names of the returned users, so you can write something like:

    assert_that(result, contains_exactly(
        has_attr(name="Alice"),
        has_attr(name="Bob"),
    ))

How would you write something that means the same thing without precisely? The order isn't deterministic, so you can't write something like:

    assert result[0].name == "Alice"
    assert result[1].name == "Bob"
    assert len(result) == 2

We could sort the users by name before making the assertion:

    result = sorted(result, key=lambda user: user.name)
    assert result[0].name == "Alice"
    assert result[1].name == "Bob"
    assert len(result) == 2

Personally, I prefer the precisely assertion!
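For readers wondering what an order-independent "contains exactly these things" check involves, here is a rough sketch in plain Python. It is not how precisely is implemented; `has_name` and `contains_exactly_in_any_order` are hypothetical names invented for this illustration, and the matching is deliberately simple (greedy).

```python
from types import SimpleNamespace


def has_name(expected):
    """Hypothetical predicate factory: matches objects whose .name equals expected."""
    return lambda item: getattr(item, "name", None) == expected


def contains_exactly_in_any_order(items, predicates):
    """Return True if every predicate matches a distinct item and nothing is left over.

    Greedy matching: fine when predicates don't overlap, but it can report a
    false negative if a single item could satisfy several predicates.
    """
    remaining = list(items)
    for predicate in predicates:
        for index, item in enumerate(remaining):
            if predicate(item):
                del remaining[index]  # each item may be matched only once
                break
        else:
            return False  # no unmatched item satisfies this predicate
    return not remaining  # leftover items mean the collection had extras


# Stand-in user objects for the example.
alice = SimpleNamespace(id=1, name="Alice")
bob = SimpleNamespace(id=2, name="Bob")
wanted = [has_name("Alice"), has_name("Bob")]

assert contains_exactly_in_any_order([bob, alice], wanted)           # order ignored
assert not contains_exactly_in_any_order([alice], wanted)            # missing Bob
assert not contains_exactly_in_any_order([alice, bob, bob], wanted)  # extra Bob
```

Only the name is inspected, so the IDs of the stand-in users never enter the assertion.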

What about something like an equality assertion?

    assert set(result) == {
        User(id=1, name="Alice", email_address="alice@example.com"),
        User(id=2, name="Bob", email_address="bob@example.com"),
    }

Now we've over-specified our test -- we need to know irrelevant details like the ID and e-mail address of the users, which might change and break this test even when the functionality we care about still works. We'll also break the test if we add any more attributes to users.
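The brittleness can be seen in a small self-contained sketch. The `User` dataclass and `fetch_users` function below are hypothetical stand-ins, and the changed e-mail address is invented to simulate an irrelevant detail drifting over time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen + default eq makes instances hashable, so they fit in a set
class User:  # hypothetical model, not taken from any real codebase
    id: int
    name: str
    email_address: str


def fetch_users():
    # Stand-in for the database fetch: Bob's e-mail has changed, order is arbitrary.
    return [
        User(id=2, name="Bob", email_address="bob@example.org"),
        User(id=1, name="Alice", email_address="alice@example.com"),
    ]


result = fetch_users()

# Over-specified: compares every field, including ones the test doesn't care about.
over_specified_passes = set(result) == {
    User(id=1, name="Alice", email_address="alice@example.com"),
    User(id=2, name="Bob", email_address="bob@example.com"),
}

# Names only: checks just the property the test is actually about.
names_only_passes = sorted(user.name for user in result) == ["Alice", "Bob"]

assert not over_specified_passes  # broken by the unrelated e-mail change
assert names_only_passes          # still holds
```

The names are still exactly Alice and Bob, yet the full-equality check fails because of a field the test never meant to cover.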

As you've mentioned, there's a cost to learning precisely. Even if you're familiar with the library, you're still potentially writing a more complex assertion (where more can go wrong) than, for instance, a plain equality assertion. In my experience, on many projects, the ability to state the intent of assertions precisely has far outweighed the downsides, but that's from a position of already being comfortable with the library.