|
|
|
|
|
by diggan
351 days ago
|
|
> rely more on snapshot testing are spot on Never quite liked "snapshot testing" which I think has a better name under "golden master testing" or similar anyways. Reason for the dislike, is that it's basically a codified "Trust me bro, it's correct" without actually making clear what you are asserting with that test. I haven't found any team that used snapshot testing and didn't also need to change the snapshots for every little change, which obviously defeats the purpose. The only things snapshot testing seems to be good for, is when you've written something and you know it'll never change again, for any reason. Beyond that, unit tests and functional/integration tests are much easier to structure in a way so you don't waste so much time reviewing changes. |
|
- Don't store/commit the snapshot and have an "update" command. Your CI/CD should run both versions of the software and diff them. That eliminates a lot of the toil.
- You should have a completely trivial way to mark that a given PR intends to have observable changes. That could be a tag on GitHub, a square-bracket thing in a commit message, etc. Details don't matter a ton. The point is that the test just catches if things have changed, and a person still needs to determine if that change is appropriate, but that happens often enough that you should make that process easy.
- Culturally, you should split out PRs which change golden-affecting behavior from those which don't. A bundle of bug fixes, style changes, and a couple new features is not a good thing to commit to the repo as a whole.
The net effect is that:
1. Performance improvements, unrelated features, etc are tested exactly as you expect. If your perf enhancement changes behavior, that was likely wrong, and the test caught it. If it doesn't, the test gives you confidence that it really doesn't.
2. Legitimate changes to the golden behavior are easy to institute. Just toggle a flag somewhere and say that you do intend for there to be a new button or new struct field or whatever you're testing.
3. You have a historical record of the commits which actually changed that behavior, and because of the cultural shift I proposed they're all small changes. Bisecting or otherwise diagnosing tricky prod bugs becomes trivial.