

by alhazrod 43 days ago

Hopefully some people find this interesting too:

TLAiBench[0]: A dataset and benchmark suite for evaluating Large Language Models (LLMs) on TLA+ formal specification tasks, featuring logic puzzles and real-world scenarios.

[0]: https://github.com/tlaplus/TLAiBench