by alhazrod 43 days ago
Hopefully some people find this interesting too:

TLAiBench[0]: A dataset and benchmark suite for evaluating Large Language Models (LLMs) on TLA+ formal specification tasks, featuring logic puzzles and real-world scenarios.

[0]: https://github.com/tlaplus/TLAiBench