| HN Mirror



	by meroes 73 days ago
	Grammar seems like you’re talking about LLMs specifically. Well, isn’t Sudoku just math? LLMs suck at Sudoku, last I checked. When told not to code a solver, the model’s very first deduction was wrong.

1 comments

evenhash 72 days ago

Generally, when people talk about using LLMs to do mathematics research, they’re not talking about the LLM alone, but the LLM plus a harness in which it writes formal proofs and runs a proof assistant such as Lean or Coq to validate its results.
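
To make the "LLM + harness" idea concrete in Sudoku terms, here is a rough sketch of a propose-and-verify loop. Everything in it is an assumption for illustration: `propose` is a hypothetical stand-in for the model call, and the checker only accepts "naked single" deductions (a cell with exactly one legal digit), which is far weaker than what Lean or Coq would check.

```python
from typing import Callable, List, Optional, Set, Tuple

Grid = List[List[int]]        # 9x9 grid, 0 marks an empty cell
Move = Tuple[int, int, int]   # (row, col, digit), all hypothetical names


def candidates(grid: Grid, r: int, c: int) -> Set[int]:
    """Digits not yet used in the row, column, or 3x3 box of (r, c)."""
    used = set(grid[r]) | {grid[i][c] for i in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)
    used |= {grid[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return set(range(1, 10)) - used


def is_forced(grid: Grid, move: Move) -> bool:
    """Accept a move only if the cell is empty and the digit is its sole candidate."""
    r, c, d = move
    if not (0 <= r < 9 and 0 <= c < 9):
        return False
    return grid[r][c] == 0 and candidates(grid, r, c) == {d}


def verified_step(
    grid: Grid,
    propose: Callable[[Grid], Move],  # stand-in for the LLM; not a real API
    attempts: int = 3,
) -> Optional[Move]:
    """Ask the proposer for a deduction and return the first one that checks out."""
    for _ in range(attempts):
        move = propose(grid)
        if is_forced(grid, move):
            return move
    return None  # every proposal was rejected by the checker
```

The point of the design is that a wrong first deduction gets rejected by the checker instead of being trusted, so the proposer's reliability matters less than the verifier's soundness.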


meroes 72 days ago

I guess I just don’t have the experience or optimism that a harness around an LLM, which can’t make the first, bare deduction on its own, is a good use of compute.

I got out of RLHF work, including games and puzzles, before agents took off, so maybe my info is outdated. But we estimated that RLHF’ing a single hard, full-sized Sudoku was ~25 hours’ worth of work.
