We work with a rather large upstream scientific codebase in Python. Its maintainers have an innate distrust of anything that isn't written by them.
Their testing system depends on tests printing "OK" after every test. This means that in many cases, tests failing are indicated by the _absence_ of "OK" being printed.
(We've attempted to isolate those parts and write our own stuff testing against upstream in pytest. We once presented a proposal to move them to pytest, offering to do any work and even wrote pytest plugins to seamlessly integrate with their current system. We got a - literal - "Thanks, but no thanks.")
That’s not the way I would do it, but is there really a problem here, assuming the library itself isn’t littered with print statements that cause false positives?
Experience has taught me that the “right” testing framework for a project is whatever the developers are happy and productive with.
I've had unit test suites in the past that silently skipped a test if that test failed to compile. Those were the worst. I only found out because I roughly knew how many tests I expected to be run.
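One way to turn "I roughly knew how many tests I expected" into an automatic check is to compare the runner's count against a number you maintain by hand. This is only a sketch with the standard `unittest` module; `ExampleTests` and `run_and_check_count` are made-up names for illustration, not part of any suite mentioned above.

```python
import unittest


class ExampleTests(unittest.TestCase):
    """Hypothetical stand-in for a real test class."""

    def test_addition(self):
        self.assertEqual(1 + 1, 2)

    def test_upper(self):
        self.assertEqual("ok".upper(), "OK")


def run_and_check_count(test_case, expected_count):
    """Run every test in test_case; succeed only if all tests pass
    AND the number of tests that actually ran matches expected_count."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_case)
    result = unittest.TestResult()
    suite.run(result)
    return result.wasSuccessful() and result.testsRun == expected_count


if __name__ == "__main__":
    # The expected count is an assumption maintained by hand.
    print(run_and_check_count(ExampleTests, 2))
```

The count has to be updated whenever a test is added, which is a nuisance, but a silently missing test then shows up as a mismatch rather than as nothing at all.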
I’ve shipped unit tests that would have failed but never ran, because I copied the function signature of the previous test and then forgot to change the name of the new one. Only one of the tests will run (the first?).
Maybe unit tests need unit tests? (There’s probably a lint rule to catch what I describe above)
> Maybe unit tests need unit tests? (There’s probably a lint rule to catch what I describe above)
Yep - meta-testing (ensuring that every unit test that exists in a project adds unique coverage, remains valid, runs as expected, and I'm sure many other properties) could (and should!) definitely be automated.
Some more advanced meta-testing could involve tracking changes to a project's source history over time (in other words: tests that run with commit history). By that I'm thinking of situations like: "does this test genuinely still test what it used to, after the test and/or application code was modified?"
Mutation testing is one example: if you make random changes to the code being tested (random in the sense of transforming valid code into different valid code), you should expect the test to then fail. If it doesn't, there is some part of the code's behaviour which is not being tested.
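To make the mutation idea concrete, here is a hand-rolled sketch using only the standard `ast` module. It applies one fixed mutation (`>=` becomes `>`) rather than random ones, and `is_adult`, `weak_test` and `strong_test` are invented for the example; real mutation-testing tools do far more than this.

```python
import ast

# Hypothetical code under test.
SOURCE = """
def is_adult(age):
    return age >= 18
"""


class GtEToGt(ast.NodeTransformer):
    """Mutation operator: replace every `>=` with `>`."""

    def visit_Compare(self, node):
        self.generic_visit(node)
        node.ops = [ast.Gt() if isinstance(op, ast.GtE) else op
                    for op in node.ops]
        return node


def load(tree):
    """Compile a module AST and return its is_adult function."""
    namespace = {}
    exec(compile(tree, "<mutant>", "exec"), namespace)
    return namespace["is_adult"]


def weak_test(fn):
    # Never probes the boundary value, so the mutant survives.
    return fn(30) is True and fn(5) is False


def strong_test(fn):
    # Adds the boundary case, which kills the mutant.
    return weak_test(fn) and fn(18) is True


original = load(ast.parse(SOURCE))
mutant = load(ast.fix_missing_locations(GtEToGt().visit(ast.parse(SOURCE))))

if __name__ == "__main__":
    print("weak test passes on mutant:", weak_test(mutant))
    print("strong test passes on mutant:", strong_test(mutant))
```

A test that still passes on the mutant (here `weak_test`) is the signal that some behaviour, in this case the boundary at 18, is not being checked.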
Wow, that sounds like the future of testing. It’d be a hard sell to managers right now, though. Some of those checks seem like they could be auto-generated.
The second one will run, because as the file gets executed top down, the second declaration overwrites the first declaration, just like when you reassign a variable.
But yeah, that would be a good thing for a linter to catch. I'm not aware of any that do.
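The overwrite is easy to see with a small sketch in plain `unittest`; the class and the `run` helper are made up for the demonstration.

```python
import unittest


class DuplicateNameTests(unittest.TestCase):
    def test_parse(self):
        # First definition: it would fail, but it is never collected.
        self.fail("first definition should have run")

    def test_parse(self):  # same name, so this replaces the method above
        self.assertEqual(int("42"), 42)


def run(test_case):
    """Run all tests in test_case quietly and return the TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_case)
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == "__main__":
    result = run(DuplicateNameTests)
    print(result.testsRun, result.wasSuccessful())
```

Only one test is collected and it passes, so the failing first definition never gets a chance to report anything.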
Huh? Seems super clear to me that "assertTrue" is asserting truthiness and not equality. It's right there in the method name! And if you don't know "True" means truthiness in Python, then you don't know the basics of Python.
A reviewer should catch this error easily. I kind of think many don't pay much attention to unit tests when reviewing, which is bad. Good unit tests are far harder to write than good code.
There are much more subtle errors of this class (false negatives / always pass).
The fact that this is a not-uncommon mistake, despite the name, indicates that people make mistakes. They don't read. They're in a hurry. They see what they want to see.
The fix isn't to blame people for making mistakes. It's to figure out a design that doesn't allow this mistake to happen in the first place.
For example, the method could (today) require the second argument to be a keyword argument. This is also something a good linter should be able to warn on.
edit: rikatee and I wrote essentially the same reply at the same time. :-)
Agreed in a perfect world, but unfortunately any process that involves humans will involve human error.
We do code review because we expect human error when the code is written by a human, but then we expect no human error when the code is read (reviewed) by a human? Any process that expects zero human error will always fail.
That's where linters add value: they allow devs to do what humans are good at (the creative, complex and interesting stuff) while the bots do what bots are good at (the boring, repetitive stuff).
I wonder if they meant Schrödinger: the test could be both passing and failing, but we don't know until we use the correct function to check the results.
Note that Schrödinger's thought experiment is intended to ridicule this way of thinking. Schrödinger is trying to suggest that since it's clearly nonsensical to imagine that a whole cat can be both dead and alive, the same would be true for other macroscopic subjects.
Instead, popular culture has decided that at best, this is what Schrödinger believed (ha, those crazy scientists) and at worst that somehow the cat being dead and not-dead at the same time is the core idea of quantum physics :/
Writing software for a long time really changes your perspective on things. It no longer seems weird that "the" cat is both alive and dead. It's just compression between two universes and until we open the box our perspective is the same.
> and at worst that somehow the cat being dead and not-dead at the same time is the core idea of quantum physics
Yet people are keeping larger and larger objects in a coherent state. Probably nobody will ever do it with a cat, but quantum physics is keeping its tradition of taking anything people think as absurd and saying "well, not really, look at this".
That typo/misread (Schroeder instead of Schrödinger) honestly almost causes it to make more sense to me, because I don't really see how those tests relate to quantum mechanics. Comparing those tests instead to a chancellor of the local labor party who was expected to help the situation of workers in the country, but essentially only used the office as a stepping stone to become a Russian oligarch, making the situation even worse in the process, makes plenty of sense to me...
Though it's unlikely to get made in the existing testing library because it's hugely breaking, the API would be better if the assertXxx methods’ optional message argument were keyword-only, and assertTrue (and assertFalse) were replaced with assertIsTrue and assertTruthy (and assertIsFalse and assertFalsey.)
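For illustration only, here is how those four methods might be sketched on top of today's `unittest`. None of them exist in the standard library; the names are simply the ones proposed above, and `ExplicitAssertions` is a made-up base class.

```python
import unittest


class ExplicitAssertions(unittest.TestCase):
    """Hypothetical base class; these methods are not part of unittest."""

    def assertIsTrue(self, value, *, msg=None):
        # Identity check: only the singleton True passes.
        self.assertIs(value, True, msg)

    def assertIsFalse(self, value, *, msg=None):
        self.assertIs(value, False, msg)

    def assertTruthy(self, value, *, msg=None):
        if not value:
            self.fail(msg or f"{value!r} is not truthy")

    def assertFalsey(self, value, *, msg=None):
        if value:
            self.fail(msg or f"{value!r} is not falsey")


class Demo(ExplicitAssertions):
    def test_truthy_accepts_nonempty_list(self):
        self.assertTruthy([0])

    def test_is_true_rejects_nonempty_list(self):
        # [0] is truthy but is not the object True.
        with self.assertRaises(AssertionError):
            self.assertIsTrue([0])

    def test_falsey_accepts_empty_string(self):
        self.assertFalsey("")


def run(test_case):
    """Run all tests in test_case quietly and return the TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_case)
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == "__main__":
    result = run(Demo)
    print(result.testsRun, result.wasSuccessful())
```

Splitting the names this way forces the author to say which check they mean, and the keyword-only `msg` rules out passing a second value by accident.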
The post covers the built-in unittest package, which 28% of devs still use. But pytest is nicer to work with. I think brownfield codebases and inertia are the reason 28% of devs work (or have to work) with unittest.
Pytest easily runs unittest codebases, and you can just start writing new tests in pytest, and gradually move to it. Most of those left in pure unittest land are probably in some category of "amount of legacy is too large" or "I don't care anymore", and most probably some amount of both.
I actually prefer unittest style tests to pytest. I hop between languages a lot and find them easier to remember how to write when I'm doing Python. I also place a lot of value on minimizing the dependencies that I have to install and every codebase I see using pytest seems to also have to pull in a dozen other pytest plugins that then have to be reviewed, pinned, and updated. I also feel like whatever niceties pytest brings in to make writing tests easier are balanced out on Django apps by having to add a `@pytest.mark.django_db` decorator on basically every single test function.
Regarding Django, if you add `pytestmark = pytest.mark.django_db` to the top of your file or organize your tests in classes and decorate those, then you won't have to decorate every single test :)
I mean, pytest ones are the easiest. They are just functions. If you're not doing anything fancy, then you don't need to do anything else. Use normal assert, not some fancy functions. Only plugin I really use is coverage.
Minimizing runtime dependencies is nice, but personally I couldn't care less about build/test time dependencies.
I don't touch Django so can't comment on that though.
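For anyone who hasn't seen the plain-function style, a sketch of what such a test file could look like. `slugify` and the file name are invented for the example, and nothing here imports pytest: the runner is expected to collect functions whose names start with `test_`.

```python
# test_slugify.py (hypothetical file name)


def slugify(title):
    """Hypothetical function under test: lowercase, hyphen-separated."""
    return "-".join(title.lower().split())


def test_slugify_joins_words_with_hyphens():
    assert slugify("Hello World") == "hello-world"


def test_slugify_collapses_whitespace():
    assert slugify("  a   b ") == "a-b"
```

Running `pytest` in the directory containing this file should pick up both functions, with no classes and no special assert methods involved.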
unittest is included in the Python standard library. Adding third-party libraries is a huge step to take for a project, and just “nicer to work with” does not cut it. Third-party libraries come and go, and depending on one means being subject to the storms of changes and lulls of inactivity and death. But the standard library is dependable.
I should have written “Adding each extra third-party library as an additional dependency is always a huge step to take for a project”.
Any one third-party library comes with these drawbacks, and each library must be evaluated individually. Some may be worth the pain (requests, etc.), but many are not. One should try to minimize the number of third-party dependencies one has, not necessarily eliminate them entirely. It’s simply that the fewer third-party libraries you depend on, the less you get of the kinds of pain I listed. Every individual third-party library will have to overcome that threshold by being useful enough. And I doubt that, for most people, pytest is that much better to work with than the built-in unittest is.
> assertTrue also accepts a second argument, which is the custom error message to show if the first argument is not truthy. This call signature allows the mistake to be made and the test to pass and therefore possibly fail silently.
Bear in mind that only 28% of codebases actually use the built-in unittest package that this gotcha affects, so really it's 20 out of (28% of 666), aka roughly 10% ... but that claim would be hard to justify to folks who dig stats.