Hacker News new | ask | show | jobs
by jstimpfle 1 day ago
May. If. If. If. In case.

We are talking about an extremely simple straightforward API with an obvious contract. It's good enough for this function to reliably surface almost all wrong uses with a segfault immediately. Wrong use will result in segfaults and otherwise bugs and crashes. The goal is not to work when used wrong but to work when used right. You cannot save the world from scratch in every little function. You still have a job to get done, and you have to move on.

1 comments

> You cannot save the world from scratch in every little function. You still have a job to get done, and you have to move on.

Or you can take all of 10 minutes to put sanity-check assertions at the start of all your public-facing API functions, eliminating a source of security bugs, get on with your life, and worry about the performance implications as and when it becomes a problem (hint: it's never going to become a problem).

You can try and do this if it's a relatively narrow public facing API, but otherwise this is a theoretic ideal. In practice, if you add an assertion for every pointer argument to every little function, you'll go insane, and it is completely pointless, and the code will not be readable anymore.

There are so many other interesting and relevant invariants that are usually in an API contract that are much harder or impossible to check upfront (let alone express formally in a type system), and even violations may be impossible to diagnose when they happen.

People focus on NULL because that's the only way they can apply their silly limited type systems. But NULL checks give very little return for investment. In practice, you'll see templated Option<T> types and whatnot, and when I have to look at or even work with such code I want to kill myself because it's so painful.

No, people focus on a handful of things like null, buffer overrun, and use-after-free because they still make up the majority of security vulnerabilities that we see exploited in the wild. You may imagine that subtle logic errors are more common, but the data doesn't bear that out; also FWIW I've never seen one of these detailed invariants be impossible to express in a type system if you spend 5 minutes actually trying.
Typical invariant for me would look like:

Given a, b, c input parameters to my func, it must hold that that a->m->t == b->t. c->mutex must be held, and c->cond is the condition variable that goes with c->mutex and will release any waiters on the buffer contained in a.

Or: Integer x is representable using 12 bits only, Integer y should be a multiple of N and I have a integer s is used as a bit-shift that should be less than 8.

Or: I need to guarantee that no locks have to be taken and no allocations have to be made on this complicated looking codepath. While holding a lock, we must not do any syscalls (syscall a, b, c are ok though), and surely not make any logging calls.

I know only one system that can express this, it's called STRAIGHTFORWARD CODE, and it requires doing engineering and casual logic out-of-band, and yes it does include making mistakes and repairing them incrementally.

I don't know a type system that would let me explain these things to me and tell me where I was wrong. But maybe you can show me, with 5 minutes of actually trying?

> it must hold that that a->m->t == b->t.

So define a wrapper type that represents that invariant (it's not going to take up space at runtime), where the only constructor enforces it?

> Integer x is representable using 12 bits only, Integer y should be a multiple of N and I have a integer s is used as a bit-shift that should be less than 8.

Those are all standard things that already exist?

> Or: I need to guarantee that no locks have to be taken and no allocations have to be made on this complicated looking codepath. While holding a lock, we must not do any syscalls (syscall a, b, c are ok though), and surely not make any logging calls.

Sounds like a pretty standard free monad case? Define a command algebra in which the "ok" syscalls are a subtype, and then require that the thing you want to only use the ok calls to have a type that reflects that?

Please, go ahead and type the example. I think you are trolling.

> Define a command algebra in which the "ok" syscalls are a subtype

Dude, it's clear you're not doing any actual work. You are living in an ivory tower, and you underestimate the complexity and detail and volatility of real world applications by at least 3 orders of magnitude. You don't understand how to modularize and contain complexity.

You _cannot_ complete a project with this attitude.

You are ignorant of the fact that a type system is necessarily a blunt simplification of the real complexity. Therefore, use of types must be pragmatic, and actual logic must be coded in normal code (which should be obvious but it isn't to type theory weirdos). Otherwise, you require dependent typing or whatever, and you will have to write your code twice, once in a usable programming language and once in a very unusable programming language. Much more than only twice actually, given that all the implicit detail should apprently go explicitly formalized at the type level.

Just to make sure I'm not entirely talking out of my arse because I'm so incredibly annoyed by your otherworldly proposition, I asked an AI about the sel4 microkernel. It consists of 10,000 lines of C code (that says a lot about its practical utility, which is very limited), and of 1,3 million lines of manually written proof code (which says a lot about the practicality of proving).

It takes a lot longer to figure out if it'll be a problem than to just add the check. And you don't have to ponder whether it's possible for a null to get there, because now it's fine if it does.
Are you talking about extending the API contract to allow for NULL? That is often the path to madness, especially if it requires complicating the signature (return value etc). Better to just assert/crash.
No. I'm talking about adding the check to reject NULL. Then you don't have to spend time justifying or figuring out why a NULL can't turn up here.
So reject as in assert? But how does that go together with what you said, "because now it's fine if it does"?
Because no one is expecting it to work if a null is passed. Your total range of behaviours left are crashes, doesn't crash and is silently ok, or doesn't crash and causes something worse (data corruption, you get your product in a CVE, that area).

My proposition is that "it's silently ok" isn't likely enough, which is in line with your position on "don't extend the contract to accept null". So what's left is crash, or something worse.

So if those are your choices, don't waste time justifying that a null can't get there, just add a check to ensure you get the better behaviour. It takes seconds.