|
Precisely this. People dismiss the utility of LLMs because they don't give 100% reliability, without considering the basic facts that: - LLMs != ChatGPT interface; they don't need to be run in isolation, nor do they need to do everything end-to-end. - There are no 100% reliable systems - neither technological nor social. Voltages fluctuate, radiation flips bits, humans confabulate as much as LLMs do, if not more, etc. - We create reliability from unreliable systems. LLMs aren't some magic unreliability pixie dust that makes everything they touch beyond repair. They're just another system with bounded reliability, and can be worked into larger systems just like anything else, and total reliability can be improved through this. EDIT: In fact, my example with probabilistic primality tests is bad because those tests are too nice - they let us compute tight bounds on the error rate in advance. LLMs are not like that. But then, a lot of systems we rely on in our daily lives also have this property - their reliability is established empirically, i.e. we improve them until they work reliably enough, and then we hope they'll keep on working, and deal with random failures when they occur. So that's nothing new, either. |
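To make "reliability from unreliable systems" concrete, here is a rough sketch of the textbook case: majority voting over an odd number of components that each fail with the same probability. The function name `majority_error` and the 20% failure rate are made up for illustration, and the math assumes failures are independent. That assumption is the "too nice" part: it is a fair model for things like random bit flips, but it is not something you get for free from repeated LLM calls, where errors can be correlated and the failure rate has to be measured empirically.

```python
from math import comb


def majority_error(p: float, n: int) -> float:
    """Probability that a majority of n voters is wrong.

    Assumes (hypothetically) that each voter is wrong independently
    with the same probability p. n must be odd so there are no ties.
    """
    if n % 2 == 0:
        raise ValueError("n must be odd to avoid ties")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    k_min = n // 2 + 1  # smallest number of wrong votes that forms a majority
    # Sum the binomial probabilities of k_min..n voters being wrong.
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


if __name__ == "__main__":
    # Illustrative per-component failure rate, not a measured value.
    for n in (1, 3, 5, 11):
        print(f"n={n:2d}  majority wrong with probability {majority_error(0.2, n):.5f}")
```

With a 20% per-component failure rate, three voters bring the combined failure rate down to 10.4% and five voters to about 5.8%. The improvement only holds while each component is wrong less than half the time and the independence assumption is roughly true.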
Saying LLMs are no worse than random bit flips is, again, an unjustified comparison. We can control bit errors with ECC; we cannot control the output of an LLM except by shackling it into uselessness.