| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by slushy-chivalry 761 days ago
	maybe theorem proving could help? ask gpt4o to produce a proof in coq and see if it checks out...or split it into multiple agents -- one produces the proof of the closed formula for the tape roll thickness, and another one verifies it

2 comments

jamilton 761 days ago

I had the thought recently that theorem provers could be a neat source of synthetic data. Make an LLM generate a proof, run it to evaluate it and label it as valid/invalid, fine-tune the LLM on the results. In theory it should then more consistently create valid proofs.

link

ryanianian 761 days ago

Sure, but those are heuristics and feedback loops. They are not guaranteed to give you a solution. An LLM can never be a SAT solver unless it's an LLM with a SAT solver bolted on.

link

slushy-chivalry 761 days ago

I don't disagree -- there is a place for specialized tool, and LLM wouldn't be my first pick if somebody asked me to add two large numbers.

There is nothing wrong with LLM + SAT solver -- especially if for an end-user it feels like they have 1 tool that solves their problem (even if under the hood it's 500 specialized tools governed by LLM).

My point about producing a proof was more about exploratory analysis -- sometimes reading (even incorrect) proofs can give you an idea for an interesting solution. Moreover, LLM can (potentially) spit out a bunch of possibly solutions and have another tool prune and verify and rank the most promising ones.

Also, the problem described in the blog is not a decision problem, so I'm not sure if it should be viewed through the lenses of computational complexity.

link