Hacker News new | ask | show | jobs
by cesaref 2 days ago
Is it only me that would have expected curl_getenv() to have an assert that it's argument isn't NULL?

I know this doesn't stop runtime problems in release builds, but i'd have thought this sort of simple precondition check would help users find problems in their library useage.

It's not going to stop you passing a non-terminated string, or other such invalid input though, which is I guess more the point, that it's totally possible in C to produce good looking but actually invalid arguments that can't be spotted at runtime without UB (out of bounds access etc).

Edit: Actually thinking about this more, I guess the problem is that you are likely linking against a release library implementation, so it's not possible to add a precondition without introducing a runtime overhead, which is probably more likely what we are talking about with this case.

3 comments

If I were doing a code review, I would probably accept the code either with or without the assertion. The context of curl_getenv() makes it clear that null is not acceptable. If the author of curl_getenv() had evidence that callers are frequently breaking the contract by passing null, then perhaps the assertion would help shed some light on violators. Otherwise, I would expect everyone to play by the rules, making the assertion unnecessary.
It's also just a wrapper around getenv that provides consistent behavior across platforms, and passing a NULL name to the POSIX getenv function is UB.
That is exactly why you have a precondition or assertion.

If everyone expects specific behavior - ie it’s in the contract - you require that contract.

The problem with asserts is that they are pretty dramatic and you crash the entire program.

We generally did this in the avahi libraries, be fairly liberal with asserts that "shouldn't happen", it is a source of complaints though because basically you can be using a third party library that uses avahi and have your program crash due to a bug in that library, or in avahi. It's extra fun when using some historical libc systems such as "NSS" and you load a plugin to do hostname resolution, which nss-mdns does.. now you can have any program on the entire system crash if you are assert happy.

On the one hand I agree that if the result is going to be memory un-safety then perhaps you should assert, but more ideally you'd just fail gracefully and throw or return an error. That can sometimes be tricky though, if there is no good way to return an error or return a NULL value or similar. Depending on the API.

Of course, this is the entire reason behind the error return traditions of Golang and Rust, e.g: https://doc.rust-lang.org/book/ch09-00-error-handling.html

Which basically says what I said above :)

But in the case of curl_getenv, returning NULL seems a valid possibility (https://curl.se/libcurl/c/curl_getenv.html) as that is indicated to be done if you don't find the requested environment variable. Arguably the NULL environment variable is not found. so, this feels likely to be acceptable. Though I could see an argument for you now assuming the environment variable you were actually looking for not existing, but you didn't actually ask for one, and now your logic is broken and maybe you introduce a different class of security bug because you change your behaviour based on some environment variable not existing.

As always everything is a trade-off...

Returning to the context of this post, this is one of the things I really like about rust. (And zig, haskell, typescript, swift and others). These languages make invalid states impossible to represent. If my function takes a value of type T (or &T), you can't accidentally receive NULL. So you just don't need to worry about this stuff any more. The compiler simply won't compile the program if type checking fails. At runtime, I only have to consider valid values.
Zig still doesn't have a way to represent that a pointer to a heap allocated region is no longer valid.
Crashing a program is always a much better alternative than behaviours that silently lead to memory corrupt, having much severe outcomes than a crash.

Ah but what high integrity computing, well there neither crashes nor memory corruption are welcomed, hence programming guidelines and certification workflows that would make most C devs cry with the language features they are allowed to use, and how each line of code gets analysed by tools and humans.

Yes, but null pointers are so pervasive in C code that we really can't afford to put assertions everywhere. It's often better to let the app crash on violations.
An assertion is an app crashing on a violation. The problem is when it's not guaranteed to crash, and instead does something very wrong.
A bug is a bug even when it doesn't clearly manifest itself 100% of the time, and furthermore it is pretty much guaranteed that NULL dereference crashes with segfault in practice, only not for the people playing theoretic games whose essence of life is finding gotchas where it maybe isn't so and then feeling smarter than everyone else.

But it's >> 99.9% true that this will just crash even though it's acshually UB, nasal demons and so forth. Now raise this << 0.1% likelihood that it isn't true on some system with some compiler and build flags, to the power of the number of distinct deployed configurations out there, and you get the result which is the correct engineering decision of just moving on instead of spending your life filling straightforward code with pointless boilerplate assertions.

NB it can make sense to assert nonnull when the condition won't be tested on all code paths or the intention is otherwise not super obvious.

> it's >> 99.9% true that this will just crash even though it's acshually UB, nasal demons and so forth.

Is it though? Linux saw enough bugs from that kind of issue that they now build with -fno-delete-null-pointer-checks and accept the (supposed) performance penalty.

I don't want to nitpick people often but your use of division sign to mean percent is really throwing me off.
An assert is not guaranteed to terminate the process. In C, the most common implementation choice is to completely omit the check if you're not building in debug mode.
You need to turn it off by defining NDEBUG. While sometimes it is not for release builds, I am not sure this is common.
> it's not possible to add a precondition without introducing a runtime overhead

Indeed. Adding an assertion to a single function isn't a big deal, but if every function has to check all of it's arguments, that's going to add up. And even if you could have the assertion only in debug builds, that isn't enough unless you have a very exhaustive test suite, because an edge case could trigger undefined behavior in production in a way that wasn't exercised during testing.

In fact, the fact that the rust compiler adds runtime checks for array indexes if it can't prove the index is in bounds is a criticism some c programmers have of rust.

> In fact, the fact that the rust compiler adds runtime checks for array indexes if it can't prove the index is in bounds is a criticism some c programmers have of rust.

And the fact that after a half a century we're still debating how much we really need to care about U stuff like this when we get severe bugs in a major piece of software written in C seemingly every week is a criticism that pretty much all Rust programmers have of C.

Considering the amount of C programs that exist, the "we see severe bugs in C code seemingly every week" is on the same level of propaganda as we see "crime in the news every week" when the real societal problems are entirely different.
It is so bad as C culture, that the only way to fix the culture is by having hardware where those C programmers no longer have a say on bounds checking.

Most systems languages, with exception of C, have ways to do bounds checking, even C++ and Objective-C, by using the respective collection classes.

In addition to an assert, a 'nonnull' attribute on compilers that support it seems like a good idea.
If you use the nonnull attribute, the assertion will be optimized away.
But the compiler tells you about the issue.