Hacker News new | ask | show | jobs
by anonacct37 946 days ago
I do like the approach of static code analysis.

I found it a little funny that their big "win" for the nilness checker was some code logging nil panics thousands of time a day. Literally an example where their checker wasn't needed because it was being logged at runtime.

It's a good idea but they need some examples where their product beats running "grep panic".

1 comments

Actually, if we were running into cases where we aren't logging a panic which is actually happening in production, then the first thing to note is that we need to improve our observability. The issue might or might not be recoverable, but it should be logged. If nothing else, it should show up as a service crash somewhere within those logs, which is also something that service owners monitor and get alerts on.

The advantage of NilAway is not just detecting nil panic crashes after the fact (as you note, we should always be able to detect those eventually, once they happen!), but detecting them early enough that they don't make it to users. If the tool had been online when that panic was first introduced, it would have been fixed before ever showing up in the logs (Presumably, at least! The tool is not currently blocking, and developers can mistake a real warning for a false positive, which also exist due to a number of reasons both fundamental and just related to features still being added)

But, on the big picture, this is the same general argument as: "Why do you want a statically typed language if a dynamically typed one will also inform you of the mismatch at runtime and crash?" "Well, because you want to know about the issue before it crashes."

Beyond not making it all the way to prod, there is also a big benefit of detecting issues early on the development lifecycle, simply in terms of the effort required to address them: 'while typing the code' beats 'while compiling and testing locally' beats 'at code review time' beats 'during the deployment flow or in staging' beats 'after the fact, from logs/alerts in production', which itself beats 'after the fact, from user complains after a major outage'. NilAway currently works on the code review stage for most internal users, but it is also fast enough to run during local builds (currently that requires all pre-existing warnings in the code to either be resolved or marked for suppression, though, which is why this mode is less common).

That makes sense. I hope my first comment came across as I intended, which wasn't as criticism of the product but a suggestion about how to talk about the value of the product.
No worries, that's how I understood it too :) Just adding a bit more context on why we feel the approach does beat "grep panic" for people checking out this thread.