Hacker News new | ask | show | jobs
by thaumasiotes 236 days ago
> Maybe better tactics will help. I'm not sure.

I don't see why they would.

If anyone is curious about the phenomenon, the second problem in session 7 at https://incredible.pm/ [ ∀x.(r(x)→⊥)→r(f(x)) ⟹ ∃x.r(x)∧r(f(f(x))) ] is one where the proof is straightforward, but you're unlikely to get to it by just fooling around in the prover.

1 comments

In principle, LLMs can do this already. If you ask Claude to express this in simple words, you will get this translation of the theorem:

    "If applying f to things makes them red whenever they're not already red, then there must exist something that is red AND stays red after applying f twice to it."
Now the proof is easy to see, because it is the first thing you would try, and it works: If you have a red thing x, then either x and f(f(x)) are both red, or f(x) and f(f(f(x)) are both red. If x is not red, then f(x) is red. Qed.

You will be able to interact like this, instead of using tactics.

For anyone else for whom the justification for “either x and f(f(x)) are both red, or f(x) and f(f(f(x)) are both red” was not immediately obvious:

H: ∀x.(r(x)→⊥)→r(f(x))

goal: ∃x.r(x)∧r(f(f(x)))

If f(f(x)) is red:

    x is a solution (QED).
Otherwise:

    f(f(x)) not being red means f(x) must have been (by contraposition of H) and that f(f(f(x))) will be (by H); therefore, f(x) is a solution.