Hacker News new | ask | show | jobs
by mherrmann 1969 days ago
I know Lean as a theorem prover. Can someone explain why it is now also a programming language? Is it so you can write a program, then prove its correctness in the same language? Or are there other synergies?
2 comments

Yes to all. In Lean 3, they found it was great being able to write tactics in Lean itself, which has led to many nice tactics in mathlib. Lean 4 consolidates many lessons learned, and most of Lean itself is written in Lean now. In principle they'll be able to prove different parts of the Lean system are correct (eventually).

Lean has a C backend, so you can write in Lean, prove your program meets the spec, then output reasonably optimized (but impenetrable) C code.

I made this raytracer partly to see how efficient that code would be (I haven't gotten around to comparing it to anything yet, though), partly to try writing a real program in Lean rather than my usual theorem proving, and partly to better understand ray tracers.

In principle, one could prove this raytracer (once fully debugged :-) ) converges to the theoretical solution to the rendering equation, with arbitrary precision floats. This would be an ungodly amount of work to carry out right now in Lean 4, but it's in principle doable, I think.

It would be really interesting to know how fast the LEAN 4 version is compared with C/C++.
I ended up writing one in the meantime -- check out the end of the readme. For this workload it seems to be a factor of 30x. The C program gets away with no heap allocation and inlining the entire render loop, though Lean does do quite a lot of unboxed float arithmetic. I tried to optimize the Lean and only shaved 25% off the runtime (code in a branch).

I suspect that it could be made a bit faster if each thread had its own copy of the scene. Lean handles reference counting of shared objects differently.

See the Curry Howard Correspondence: https://en.wikipedia.org/wiki/Curry%E2%80%93Howard_correspon...

Essentially: all programming languages are logics, and all logics are programming languages; all programs are proofs, and all proofs are programs; all (static) types are propositions, and all propositions are (static) types; and so on for every facet of computer programming and theorem proving.

It's usually pointless to apply this to existing languages/logics/etc., since they end up being pretty terrible. For example, in a dynamically typed language like Python there is only one static type; hence if we view it as a logic there is only a single proposition; all Python programs are proofs of that same proposition, including trivial programs like '42', and hence the logic is completely trivial.

More sophisticated languages can still end up being pretty trivial. For example, in Java the value 'null' is a member of every type; hence from a logical perspective 'null' is a proof of every proposition. Hence Java also collapses into triviality (from the programming perspective: no matter how complicated we make our class hierarchy, or how long we make the dependency chains between our constructors, users of our API can just give 'null' for every argument and Java's type checker will gladly accept it)

Even more 'hardcore' languages like Haskell, which don't have the 'null' problem (but do have a less-severe problem called "bottom", see http://chriswarbo.net/blog/2020-02-09-bottom.html ), end up being pretty trivial too. For example, Curry-Howard tells us that function types correspond to implication propositions, so 'T1 -> T2' is a function type with argument type T1 and return type T2, and also the proposition that T1 implies T2 (i.e. if T1 is true, then T2 is true).

At first glance this looks really useful, e.g. we if we have a function with type 'CustomerRecord -> JSON' it's tempting to think of this as proof that CustomerRecords can be represented as JSON. Yet that's not what it says: in particular, the return type doesn't have any relationship to the input type. This function is actually stating that "if you give me any CustomerRecord, I can give you some JSON value". This is much less useful; e.g. we could implement this function by always returning the JSON value '"hello world"'.

Hence the Curry Howard Correspondence only tends to be useful for languages which were designed to take advantage of it. Lean is one example; others are Agda, Coq and Idris. In all of these languages, proofs and programs are the same thing; functions and implications are the same thing; conjunctions and tuples are the same thing; and so on. This lets us prove propositions about programs, programatically generate proofs, and any combination of the two.

Incidentally, the way that these languages avoid being trivial is by having dependent types. These let us state relationships between the input and output types of a function, like '(x: Int) -> Even (Double x)' ("if x has type Int, then doubling x is even"), or between the elements of a tuple, like '(x: Int, Even x)' ("a value x of type Int, and a proof that x is even"). I wrote a bit about this at http://chriswarbo.net/blog/2017-11-03-dependent_function_typ...

The logic equivalent of a dependent function type is a universally quantified proposition (i.e. "for all x, ..."). The logic equivalent of a dependent tuple type is an existentially quantified proposition (i.e. "there exists an x such that..."). This tuple idea can also be extended to name/value records too.