by jstummbillig
48 days ago
Simple: You can ask an LLM and get a good explanation for why it did something, one that will help you avoid bad behavior next time. Is that reasoning? Does it know? I might care about those questions in another context, but here I don't have to. It simply works (not all the time, but increasingly so with better models, in my experience).
I think there's something here to consider, but it's sort of like assuming that the LLM has reasons for doing things when it only has weights that determine which tokens are produced; that's the sum of its reasoning.
Maybe LLM tokens do correlate with truth values, or maybe this approach actually provides value, but there's probably good reason to be skeptical, given that we'd need to posit some sort of causal connection between "token outputs" and reasoning about prior behaviors.