Hacker News new | ask | show | jobs
by BalinKing 162 days ago
The file SSOT.lean is completely trivial, I think: Unfolding the definitions in the theorems, they say nothing but "x=1 => x=1", "x=1 => x≤1", and "x≠1 => x=0 ∨ x>1" (where x is a natural number). Basically, there's no actual proof here, at least not in that file....

This is indeed the danger of letting LLMs manage a "proof" end-to-end—they can just pick the wrong definitions and theorems to prove, and even though the proofs they then give will be formally sound, they won't prove anything useful.

2 comments

You only read the 37 lines in SSOT.lean and stopped. It's the entry point that defines DOF=1 so other files can import it. The actual proofs are in Foundations.lean (364 lines - timing trichotomy, causality), Requirements.lean (derives the two necessary language features), Completeness.lean (mechanism exhaustiveness), Derivation.lean (the uniqueness proof that achieves_ssot m = true iff m = source_hooks), Coherence.lean, CaseStudies.lean, LangPython.lean, LangRust.lean etc.

~2k lines total across the lean files. Zero sorry. Run grep -r "sorry" paper2_ssot/proofs/ if you don't believe me.

"Unfolding the definitions they say x=1 => x=1" applies to three sanity lemmas in the scaffolding file. It's like reading __init__.py and concluding the package is empty.

See my other comment—LangRust.lean is the same way.

EDIT: Just skimmed Completeness.lean, and it looks similar—at a glance, even the 3+-line proofs are very short and look a lot like boilerplate.

Interesting that you're using em dashes in your comments. Those require Alt+0151 or copy-paste. Glass houses.
And Option-Shift-Hyphen in macOS, which is easy if you know it. And a press and hold on a hyphen on iOS, which is discoverable, even.
Yeah, I'm on macOS (although even back on Windows, I used to use the Character Map all the time).
Fair, the em dash comment was a cheap shot. Withdrawn.

The substantive point stands: you've now "skimmed" multiple files, called them all "boilerplate," and haven't engaged with the actual proof structure. The rebuttals section addresses "The Proofs Are Trivial" directly (Concern 9).

At some point "I skimmed it and it looks trivial" stops being a critique and starts being "I didn't read it."

I also took a look at the `LangRust.lean`, and the majority of the proofs are just `rfl` (after an `intros`)—that's a major red flag, since it means the "theorems", like those in SSOT.lean, are true just by unfolding definitions. In general, that's basically never true of any interesting fact in programming languages (or math in general); on the contrary, it takes a lot of tedious work even to prove simple things in Lean.
Yes, many proofs are rfl. That's because we're doing engineering formalization, not pure math. The work is in getting the definitions right. Once you've correctly modeled Rust's compilation phases, item sources, and erasure semantics, the theorem that "RuntimeItem has no source field, therefore query_source returns none" should be rfl. That's the point.

The hard part isn't the proof tactics. The hard part is:

- Correctly modeling Rust's macro expansion semantics from the language reference

- Defining the compilation phases and when information is erased

- Structuring the types so that the impossibility is structural (RuntimeItem literally doesn't have a source field) If the theorems required 500 lines of tactic proofs, that would mean our model was wrong or overcomplicated. When you nail the definitions, rfl is the proof.

Compare to software verification: when you prove a sorting algorithm correct, the hard work is the loop invariants and the model, not the final QED. Tedious proof steps usually indicate you're fighting your abstractions.

The real question isn't "are the proofs short?" It's "can you attack the definitions?" The model claims RuntimeItem erases source info at compile-to-runtime. Either produce Rust code where RuntimeItem retains its macro provenance at runtime, or accept the model is correct. The rfl follows from the model being right.

> Compare to software verification: when you prove a sorting algorithm correct, the hard work is the loop invariants and the model, not the final QED. Tedious proof steps usually indicate you're fighting your abstractions.

This is a false statement when working with an interactive theorem prover like Lean. Even trivial things require mountains of effort, and even blatantly obvious facts will at least require a case analysis or something. It's a massive usability barrier (and one that AI can hopefully help with).

This is addressed in the paper's Preemptive Rebuttals section (Concern 9: "The Proofs Are Trivial").

At 2k lines of lean, the criticism was "these proofs are trivial." At 9k lines of lean with 541 theorems, the criticism is... still "trivial"? At what point does the objection become "I didn't read it"?

The rfl proofs are scaffolding. The substantive proofs (rust_lacks_introspection, Inconsistency.lean, Coherence.lean) are hundreds of lines of actual reasoning. This is in the paper.