| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by cpeterso 114 days ago

Looks like LLMs also find Dafny easier to write than Lean. This study, “A benchmark for vericoding: formally verified program synthesis”, reports:

> We present and test the largest benchmark for vericoding, LLM-generation of formally verified code from formal specifications … We find vericoding success rates of 27% in Lean, 44% in Verus/Rust and 82% in Dafny using off-the-shelf LLMs.

https://arxiv.org/html/2509.22908v1

1 comments

nextos 114 days ago

Not surprising, as Dafny is a bit less expressive (refinement instead of dependent types) and therefore easier to write. IMHO, it hits a very nice sweet spot. The disadvantage of Dafny is the lack of manual tactics to prove things when SAT/SMT automation fails. But this is getting fixed.

link