Hacker News
by torginus 310 days ago
It's a bit different for reasoning LLMs - they operate in a feedback loop, measuring the quality of the solution and iterating on it until either the quality meets a desired threshold, or all reasoning effort is expended.

This can correct for generation errors, but it cannot correct for errors in the quality measurement itself, so the question is valid.
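
To make that concrete, here's a toy sketch of the loop (not how any real model is implemented). Everything in it is made up for illustration: `refine`, `make_generator`, `true_quality`, `biased_quality`, and the numbers. Candidates are just integers, and "quality" is the negative distance to a target.

```python
from itertools import cycle

TARGET = 10  # hypothetical "correct answer" for this toy problem


def make_generator():
    """Return a hypothetical noisy generator that yields a fixed cycle of drafts."""
    drafts = cycle([3, 17, 8, 13, 10, 12])
    return lambda: next(drafts)


def true_quality(x):
    """Accurate scorer: 0 is perfect, more negative is worse."""
    return -abs(x - TARGET)


def biased_quality(x):
    """Scorer with a measurement error: it wrongly believes the target is 13."""
    return -abs(x - 13)


def refine(generate, measure, threshold, budget):
    """Draft and score until the measured quality meets the threshold
    or the budget (number of attempts) is used up."""
    best, best_score = None, float("-inf")
    for _ in range(budget):
        candidate = generate()
        score = measure(candidate)
        if score > best_score:
            best, best_score = candidate, score
        if best_score >= threshold:
            break
    return best, best_score


# With an accurate scorer, the loop filters out the bad drafts.
good_answer, good_score = refine(make_generator(), true_quality, threshold=0, budget=20)

# With a biased scorer, the loop converges just as confidently, on the wrong answer.
bad_answer, bad_score = refine(make_generator(), biased_quality, threshold=0, budget=20)

print(good_answer, good_score)                             # 10 0
print(bad_answer, bad_score, true_quality(bad_answer))     # 13 0 -3
```

Both runs stop with a "perfect" measured score. Only the first is actually right, and nothing inside the loop can tell the difference.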