by genrilz
595 days ago
You might not be able to sell someone a library that fixes all bugs, but you can sell (or give away) software systems that reduce the number of bugs, and doing that is pretty useful. Examples include linters, fuzzers, testing frameworks, and memory-safe programming languages (as in Rust, but also as in any language with a GC). Most of these reduce the number of bugs in the final product by giving you a way to detect them; memory-safe languages instead eliminate a whole class of bugs outright.

The paper is advertising a method to detect whether a given output is likely to be affected by a "bug", along with a taxonomy of the symptoms of such bugs. It doesn't provide a way to fix them, and hallucinations don't necessarily have a single cause. Some might be fixed by contextual calibration [0], others by adding more training data similar to the wrong example.

In any case, you need to find the bad outputs before you can fix them. Because LLMs tend to be used to produce "fuzzy" outputs with no single right answer, traditional testing frameworks and the like aren't always applicable.

[0] https://learnprompting.org/docs/reliability/calibration
It's a panacea