by drinfinity
1191 days ago
Confining an LLM to the very narrow domain of "calculators" is a mistake, I think. You wouldn't say "a programmer that is 99% correct is worthless, I need 100%". I'm pushing it, but for a fairer comparison I'd say measure it against a programmer. How often are we wrong? 75% of the time? :) And that's being generous. It's the tools that make us productive. I don't know about you specifically, but I don't think you'd be very productive with a bare terminal lacking any modern IDE-like or even REPL facilities. Suppose I asked you to come up with instantly working code every time, all the time. It doesn't work like that. You need iteration, and I believe these kinds of AI have the same issues as us. They are wrong sometimes (often) and need feedback.
It's funny how we resort to humanizing the machines when their results are inaccurate. We don't do that with the calculator, because it's expected to be 100% bug-free. When there's a bug in the calculator code, we expect it to be fixed, not gradually improved.
Speaking of bugs: mistakes in code are one thing; wrong output because of a fundamental flaw in the algorithm is another. The statistical machines we are dealing with work as intended, or at least the wrong output the top comment here brings up is not a bug, it's a feature. That's the difference.