by isotypic
536 days ago
> and finally find a path to the solution. But how does the student, or in your case the LLM, know that it actually has the solution? For students, this is done by: a grader grading the homework, asking the professor at OH, working on problems with other peers who crosscheck as you go. I see no reason why this LLM produced synthetic data, without this correction factor, would not devolve into a mess of incorrect, maybe even not-even-wrong style "proofs". And then how can training on this yield anything?

(In principle, it should also be possible to get good enough at philosophy to avoid devolving into a mess of incoherence while reasoning about concepts like "knowledge", "consciousness", and "morality". I suspect some humans have achieved that, but it seems rather difficult to tell...)