by resource_waste
778 days ago
I feel like this is the perfect application for running the same data through multiple models. Imagine having ~10-100 different LLMs: maybe some are medical, some are general, some were trained in a different language. Have them all answer, then rank the answers. I believe this can be amplified further by having another prompt ask to confirm the previous answer. This could get a bit insane computationally with 100 original answers, but if I remember the paper I read correctly, repeating that confirmation prompt ~4 times got them to some 95% accuracy. So 100 LLMs each give an answer, and each answer gets processed 4 times: can we beat a 64-year-old doctor?
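To make the idea concrete, here is a rough Python sketch of the "many models, confirm a few times, then rank" loop. Everything in it is an assumption for illustration: `ensemble_answer`, `make_stub`, and the confirmation prompt wording are made up, the models are plain callables standing in for real LLM calls, and I am assuming each model re-checks its own answer (the paper may have done it differently).

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

# Hypothetical interface: a "model" is any callable taking a prompt, returning text.
Model = Callable[[str], str]


def ensemble_answer(
    question: str, models: Sequence[Model], verify_rounds: int = 4
) -> Tuple[List[Tuple[str, int]], int]:
    """Ask every model, let each re-confirm its answer, then rank by votes.

    Returns (answers ranked by vote count, total number of model calls).
    """
    votes: Counter = Counter()
    calls = 0
    for model in models:
        answer = model(question)
        calls += 1
        # Assumption: the same model is asked to confirm or correct its answer.
        for _ in range(verify_rounds):
            prompt = (
                f"Question: {question}\n"
                f"Proposed answer: {answer}\n"
                "Reply with the confirmed or corrected answer only."
            )
            answer = model(prompt)
            calls += 1
        # Normalize so trivially different spellings count as the same vote.
        votes[answer.strip().lower()] += 1
    return votes.most_common(), calls


def make_stub(first: str, final: str) -> Model:
    """Toy stand-in for an LLM: gives `first` initially, `final` when verifying."""
    def stub(prompt: str) -> str:
        return final if "Proposed answer" in prompt else first
    return stub


# Three toy models; the second one changes its mind during confirmation.
models = [make_stub("flu", "flu"), make_stub("cold", "flu"), make_stub("cold", "cold")]
ranked, calls = ensemble_answer("Patient has fever and aches?", models)
print(ranked, calls)
```

Under these assumptions the cost is models × (1 + confirmation rounds) calls per question, so 100 LLMs with 4 confirmation passes each is 500 calls. Plain majority voting is only one way to "rank the answers"; weighting the medical models more heavily would be another.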
|
Even with such a system, which will still have some hallucination rate, adding Deterministic Quoting on top will still help.
It feels to me like we are a long way from LLM systems with trivial rates of hallucination.