by jcims
1596 days ago
These language models feel, to me, like the unfiltered self. If someone asked me what 838+1283 was, my head would instantly offer up some number, 2301 or something. But I would discard that number, because I learned in elementary school that I don't come up with good values; I need to execute a process in order to get the right value. I imported the CSV version, and I'm no statistician, but the 90th-percentile relative error is 8.6%, which looks something like this: What is 22730 - 24978? -2448 (real answer: -2248). That's totally within range of something that would plop into my head...with one exception. Of 1000 entries, only five have an incorrect last digit. I think that's meaningful...it almost tells me that there's a multi-stage operation happening in there somewhere.
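For anyone who wants to reproduce this kind of check, here is a minimal sketch of the two measurements mentioned above: relative error and whether the last digit matches. The function names and the nearest-rank percentile are my own assumptions, not the method actually used on the CSV, and no real data is loaded here; only the one example from the comment is plugged in.

```python
import math

def relative_error(predicted, actual):
    # |predicted - actual| / |actual|; assumes actual != 0
    return abs(predicted - actual) / abs(actual)

def last_digit_matches(predicted, actual):
    # Compare the final decimal digit, ignoring sign
    return abs(predicted) % 10 == abs(actual) % 10

def percentile(values, q):
    # Nearest-rank percentile (one of several common definitions)
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]

# The single example quoted in the comment
predicted = -2448
actual = 22730 - 24978  # -2248

print(relative_error(predicted, actual))      # about 0.089
print(last_digit_matches(predicted, actual))  # True
```

With the full list of (predicted, actual) pairs, `percentile([relative_error(p, a) for p, a in pairs], 90)` would give a 90th-percentile figure comparable to the one quoted, though the exact value depends on which percentile definition is used.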
A generator-critic framework with multiple rounds of iteration could help work around these limitations of the LM.
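As a rough illustration of what a generator-critic loop could look like, here is a toy sketch. Everything in it is hypothetical: `toy_generator` stands in for the LM's "first number that pops into its head" (deliberately off by 200, like the example above), and `toy_critic` stands in for a checker that executes the actual process and reports only whether the guess was too low or too high. A real system would use models for both roles.

```python
def refine(generate, critique, problem, max_rounds=5):
    # Alternate between proposing an answer and critiquing it
    feedback = None
    candidate = None
    for _ in range(max_rounds):
        candidate = generate(problem, feedback)
        feedback = critique(problem, candidate)
        if feedback is None:  # critic has no objection
            return candidate
    return candidate  # best effort after max_rounds

def toy_generator(problem, feedback):
    # Hypothetical stand-in for the LM: first guess is off in the hundreds place
    a, b = problem
    if feedback is None:
        return a - b - 200
    previous, verdict = feedback
    # Nudge the previous guess by 100 in the direction the critic indicated
    return previous + 100 if verdict == "too_low" else previous - 100

def toy_critic(problem, candidate):
    # Hypothetical stand-in for a checker that runs the exact process
    a, b = problem
    exact = a - b
    if candidate == exact:
        return None
    return (candidate, "too_low" if candidate < exact else "too_high")

print(refine(toy_generator, toy_critic, (22730, 24978)))  # -2248
```

On this example the first guess is -2448, the critic says "too low" twice, and the third round lands on -2248. With `max_rounds=1` the loop just returns the unrefined first guess.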