by volkercraig
17 days ago
More than that, the entire structure of the study is pointless. They set it up as a question/response and then had humans rate the response. That's literally what LLMs are trained to do, which ultimately is convincing a human to click the "I like this one better" button on its response.
|
Convincing a human law professor to click the "I would prefer to deliver this response to a student" button, and to not click the "this response is pedagogically harmful" button is a different task!
I could imagine an LLM convincing a typical human to click the "I like this one better" button with flattery, or with nice-sounding platitudes, or with hand-wavy explanations that sound plausible. And in fact that's exactly what LLMs do when they go wrong: they bluff and output superficially plausible nonsense!
But these weren't typical humans; they were law professors specifically tasked with deciding which response was the better one to give students as a canonical answer to a contract law question. So I think this is a genuinely impressive result.