HN Mirror

by sfink 570 days ago

    > These days I thus view asserts as falling into two categories:
    >    1. Checking problem domain assumptions.
    >    2. Checking internal assumptions.

(1) is the category of assert that should not be an assert. That is an error to be handled, not asserted.

Ok, to be fair, (1) is really a combination of two categories: (1a) assumptions about uncontrolled external input, and (1b) assumptions about supposedly controlled or known input. Both should be handled with error checking most of the time, but it's forgivable for (1b) to be asserted if it's too inconvenient to do proper error handling. (1b) and (2) are problems that you need to fix, and the sooner and clearer the issue is announced, the more likely and easier it is to be fixed.

One thing I didn't see mentioned is that asserts, especially category (2), enable fuzz testing to be vastly more effective. A fuzz test doesn't have to stumble across something that causes a crash or a recognizably bad output; it just needs to trigger an assert. Which is another reason to not use asserts for unexpected input; fuzzers are supposed to give unexpected input, and your program is supposed to handle it reasonably gracefully. If you over-constrain the input, then first you'll be wrong because weirdness will sneak in anyway, and second the fuzzer is harder to implement correctly and is less powerful. The fuzzer is supposed to be a chaos monkey, and it works best if you allow it to be one.
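
To make that split concrete, here is a minimal sketch in Python. Everything in it is hypothetical and made up for illustration: the `parse_chunks` function, its length-prefixed format, and the `ParseError` type. Bad input is reported as a handled error, internal assumptions are asserted, and the fuzz loop feeds in unconstrained random bytes and only treats an assertion failure as a bug.

```python
import random


class ParseError(Exception):
    """Bad input (category 1): an error to be handled, not asserted."""


def parse_chunks(data: bytes) -> list[bytes]:
    """Parse chunks made of one length byte followed by that many payload bytes."""
    chunks = []
    pos = 0
    while pos < len(data):
        n = data[pos]
        pos += 1
        if pos + n > len(data):
            # Unexpected input is normal for a parser, so report it as an error.
            raise ParseError(f"chunk claims {n} bytes, only {len(data) - pos} left")
        chunks.append(data[pos:pos + n])
        pos += n
    # Internal assumptions (category 2): if these fail, the parser itself is buggy.
    assert pos == len(data)
    assert sum(len(c) + 1 for c in chunks) == len(data)
    return chunks


def fuzz(iterations: int = 10_000, seed: int = 0) -> int:
    """Throw random bytes at the parser; return how many inputs were rejected."""
    rng = random.Random(seed)
    rejected = 0
    for _ in range(iterations):
        data = bytes(rng.randrange(256) for _ in range(rng.randrange(32)))
        try:
            parse_chunks(data)
        except ParseError:
            rejected += 1  # graceful rejection is the expected outcome
        # AssertionError is deliberately not caught: it is the bug signal.
    return rejected
```

Because the generator is not constrained to valid input, the fuzzer stays trivial to write, and any `AssertionError` that escapes `fuzz()` points at a real internal bug rather than at a weird input.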

2 comments

josephg 570 days ago

Yeah I do this kind of fuzz testing all the time. It's an incredible way to test code.

For fuzz testing I go even further with asserts. I usually also write a function called dbg_check(), which actively goes through all internal data and checks that all the internal invariants hold. E.g., in a b-tree, the depth should be the same for all children, children should be in order, every node should hold between n/2 and n entries, and so on.

If anything breaks during fuzz testing (which is almost guaranteed), you want the program to crash as soon as possible - since that makes it much easier to debug. I'll wrap a lot of methods which modify the data structure in calls to dbg_check, calling it both before and after making changes. If dbg_check passes before a function runs, but fails afterwards - then I have a surefire way to narrow in on the buggy behaviour so I can fix it.
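
A rough sketch of that pattern in Python, using a much simpler structure than a b-tree. The `SortedSet` class, its cached `total` field, and the `checked` decorator are all hypothetical names invented for this example; only `dbg_check` comes from the description above.

```python
import bisect
import functools


def checked(method):
    """Wrap a mutating method so dbg_check() runs before and after it."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        self.dbg_check()  # state was already broken before this call?
        result = method(self, *args, **kwargs)
        self.dbg_check()  # or did this call break it?
        return result
    return wrapper


class SortedSet:
    """Sorted list of unique ints with a cached sum (redundant state that can drift)."""

    def __init__(self):
        self.items = []
        self.total = 0

    def dbg_check(self):
        """Walk all internal data and assert every invariant."""
        pairs = zip(self.items, self.items[1:])
        assert all(a < b for a, b in pairs), "items out of order or duplicated"
        assert self.total == sum(self.items), "cached total is stale"

    @checked
    def add(self, x):
        i = bisect.bisect_left(self.items, x)
        if i == len(self.items) or self.items[i] != x:
            self.items.insert(i, x)
            self.total += x

    @checked
    def remove(self, x):
        i = bisect.bisect_left(self.items, x)
        if i < len(self.items) and self.items[i] == x:
            del self.items[i]
            self.total -= x
```

If the check passes on entry to `add` but fails on exit, the bug is inside `add`. The full walk is O(n) per call, so in practice it would probably be switched off outside of tests and fuzzing.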

hawski 570 days ago

I fully agree with you. Though I would add that it depends a little on the type of program. A CLI program's lazy, but often sufficient, way of handling erroneous input is asserts (that are still enabled in release). In a GUI app it shouldn't happen.

sfink 569 days ago

Yes. I mean, assert spew should probably be maximally informative, which means it's probably going to vomit out a big stack trace. So for heavily used CLIs (especially if the users are other people), it can be nice to handle the bulk of the common errors specifically (by emitting a brief and to the point error message). Bonus points for suggesting a root cause and what to do differently to make it not happen. And this isn't just unnecessary polish -- it's easy for the cause of the error ("/homw/sfink/.config/myapp not found") to get buried in the noise and for the user to not notice when they've fixed one thing and moved on to the next.

But that's only worth it for some CLI tools. For many, I agree that spewing out an assert failure is plenty good enough.
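
For the tools where it is worth it, a minimal sketch of handling the common errors specifically might look like the following in Python. The `myapp` name, the `UserError` type, and the `key=value` config format are assumptions made up for illustration. Expected, user-fixable failures get a one-line message, and anything else is left to fail loudly with a full traceback.

```python
import sys
from pathlib import Path


class UserError(Exception):
    """An expected failure the user can fix; reported without a stack trace."""


def load_config(path: Path) -> dict[str, str]:
    try:
        text = path.read_text()
    except FileNotFoundError:
        # Brief, to the point, and suggests what to do differently.
        raise UserError(
            f"{path} not found. Check the path for typos, or create the file."
        ) from None
    settings = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise UserError(f"{path}:{lineno}: expected key=value, got {line!r}")
        settings[key.strip()] = value.strip()
    return settings


def main(argv: list[str]) -> int:
    if len(argv) != 1:
        print("usage: myapp CONFIG", file=sys.stderr)
        return 2
    try:
        settings = load_config(Path(argv[0]))
    except UserError as e:
        print(f"myapp: error: {e}", file=sys.stderr)
        return 2
    # Anything else, including AssertionError, propagates with its traceback.
    print(f"loaded {len(settings)} settings")
    return 0
```

A script would typically end with `sys.exit(main(sys.argv[1:]))`. Only the errors named in `load_config` get the friendly treatment; the rest still produce the maximally informative spew.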
