Hacker News new | ask | show | jobs
by giraj 1402 days ago
This sounds like a cool demo, but I find it hard to imagine it being useful. For one, in order to check the output you need to be proficient enough that you might as well write the statement that you want. In addition, if you're working on a large library then there are small but important design decisions that matter for many statements, just like when designing ordinary software. I doubt that Codex is able to make coherent design decisions any time soon.

I would love for computers to be able to understand informal but rigorous mathematical reasoning, but, having worked with current proof assistants, it sure feels like software design will have to be solved in the same sense first.

That said, kudos for an awesome demo!

1 comments

I think one thing that this could help with is that, even if you are proficient in Lean, finding the way you are supposed to translate an informal statement into Lean/mathlib can be time-consuming if not a challenge -- it is a very large library, and it is mathematically heavy (being a library for theoretical mathematics). At the least, this gives you hints about which parts of mathlib you should be looking at.
Maybe! You're certainly correct that there's a difference between knowing Lean (or a given proof assistant) vs. knowing mathlib (or some specific library). It doesn't seem to me that Codex has much of an idea about the mathlib codebase, since it e.g. invented and used "tangent_space_at_identity" out of the blue. Library-specific training would improve on this, of course.

I would reach for the usual code search tools for getting familiar with a library. For example, in Coq you have the "Search" and "Print Hint" commands which let you search for terms and instances, respectively. I imagine Lean has something similar.

> It doesn't seem to me that Codex has much of an idea about the mathlib codebase, since it e.g. invented and used "tangent_space_at_identity" out of the blue

But then perhaps it's a hint to the user that this API may be an useful abstraction, and perhaps one can use the same tool to define this function