| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by jstimpfle 1 day ago

You can try and do this if it's a relatively narrow public facing API, but otherwise this is a theoretic ideal. In practice, if you add an assertion for every pointer argument to every little function, you'll go insane, and it is completely pointless, and the code will not be readable anymore.

There are so many other interesting and relevant invariants that are usually in an API contract that are much harder or impossible to check upfront (let alone express formally in a type system), and even violations may be impossible to diagnose when they happen.

People focus on NULL because that's the only way they can apply their silly limited type systems. But NULL checks give very little return for investment. In practice, you'll see templated Option<T> types and whatnot, and when I have to look at or even work with such code I want to kill myself because it's so painful.

1 comments

lmm 1 day ago

No, people focus on a handful of things like null, buffer overrun, and use-after-free because they still make up the majority of security vulnerabilities that we see exploited in the wild. You may imagine that subtle logic errors are more common, but the data doesn't bear that out; also FWIW I've never seen one of these detailed invariants be impossible to express in a type system if you spend 5 minutes actually trying.

link

jstimpfle 20 hours ago

Typical invariant for me would look like:

Given a, b, c input parameters to my func, it must hold that that a->m->t == b->t. c->mutex must be held, and c->cond is the condition variable that goes with c->mutex and will release any waiters on the buffer contained in a.

Or: Integer x is representable using 12 bits only, Integer y should be a multiple of N and I have a integer s is used as a bit-shift that should be less than 8.

Or: I need to guarantee that no locks have to be taken and no allocations have to be made on this complicated looking codepath. While holding a lock, we must not do any syscalls (syscall a, b, c are ok though), and surely not make any logging calls.

I know only one system that can express this, it's called STRAIGHTFORWARD CODE, and it requires doing engineering and casual logic out-of-band, and yes it does include making mistakes and repairing them incrementally.

I don't know a type system that would let me explain these things to me and tell me where I was wrong. But maybe you can show me, with 5 minutes of actually trying?

link

lmm 20 hours ago

> it must hold that that a->m->t == b->t.

So define a wrapper type that represents that invariant (it's not going to take up space at runtime), where the only constructor enforces it?

> Integer x is representable using 12 bits only, Integer y should be a multiple of N and I have a integer s is used as a bit-shift that should be less than 8.

Those are all standard things that already exist?

> Or: I need to guarantee that no locks have to be taken and no allocations have to be made on this complicated looking codepath. While holding a lock, we must not do any syscalls (syscall a, b, c are ok though), and surely not make any logging calls.

Sounds like a pretty standard free monad case? Define a command algebra in which the "ok" syscalls are a subtype, and then require that the thing you want to only use the ok calls to have a type that reflects that?

link

jstimpfle 20 hours ago

Please, go ahead and type the example. I think you are trolling.

> Define a command algebra in which the "ok" syscalls are a subtype

Dude, it's clear you're not doing any actual work. You are living in an ivory tower, and you underestimate the complexity and detail and volatility of real world applications by at least 3 orders of magnitude. You don't understand how to modularize and contain complexity.

You _cannot_ complete a project with this attitude.

You are ignorant of the fact that a type system is necessarily a blunt simplification of the real complexity. Therefore, use of types must be pragmatic, and actual logic must be coded in normal code (which should be obvious but it isn't to type theory weirdos). Otherwise, you require dependent typing or whatever, and you will have to write your code twice, once in a usable programming language and once in a very unusable programming language. Much more than only twice actually, given that all the implicit detail should apprently go explicitly formalized at the type level.

Just to make sure I'm not entirely talking out of my arse because I'm so incredibly annoyed by your otherworldly proposition, I asked an AI about the sel4 microkernel. It consists of 10,000 lines of C code (that says a lot about its practical utility, which is very limited), and of 1,3 million lines of manually written proof code (which says a lot about the practicality of proving).

link