by yatz
762 days ago
Once you correct the LLM, it will continue to provide the corrected answer until some time later, when it will again make the same mistake. At least, this has been my experience. If you are using an LLM to pull answers programmatically and rely on their accuracy, here is what worked for me for structured or numeric answers, such as numbers, JSON, etc.
1) Send the same prompt twice, including "Can you double check?" in the second prompt to force GPT to verify the answer.
2) If both answers are the same, you got the correct answer.
3) If not, then ask it to verify a 3rd time, and then use the answer it repeats.
Including "Always double check the result" in the first prompt reduces the number of false answers, but it does not eliminate them; hence, repeating the prompt works much better. It does significantly increase the API calls and token usage, so only use it if data accuracy is worth the additional cost.
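A rough sketch of the three steps above in Python, in case it helps. The `ask` callable is a hypothetical stand-in for whatever API client you use (it takes a prompt string and returns the answer text); `double_checked_answer` and the exact-match comparison after stripping whitespace are my assumptions, not part of any library. Returning `None` when no answer repeats is also my choice, since the steps do not say what to do in that case.

```python
from typing import Callable, Optional

# Hypothetical: any function that sends a prompt to the model and returns its reply text.
AskFn = Callable[[str], str]

DOUBLE_CHECK = "\nCan you double check?"


def double_checked_answer(ask: AskFn, prompt: str) -> Optional[str]:
    """Return an answer the model gives at least twice, or None if none repeats."""
    # Step 1: same prompt twice, the second time asking for a double check.
    first = ask(prompt).strip()
    second = ask(prompt + DOUBLE_CHECK).strip()

    # Step 2: matching answers are accepted.
    if first == second:
        return first

    # Step 3: ask a third time and keep the answer that repeats.
    third = ask(prompt + DOUBLE_CHECK).strip()
    if third in (first, second):
        return third

    # Assumption: three different answers means we have no usable result.
    return None
```

This costs two to three calls per answer, and exact string comparison only makes sense for short structured outputs; for JSON you would probably want to parse both replies and compare the parsed values instead.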
|
That is only true if you stay within the same chat. It is not true across chats. Context caching is something that a lot of folks would really really like to see.
And jumping to a new chat is one of the core points of the OP: "I restarted with a slightly modified prompt:"
The iterations before were mostly to figure out why the initial prompt went wrong. And AFAICT there's a good insight in the modified prompt - "Make no assumptions". Probably also "ensure you fully understand how it's labelled".
And no, asking repeatedly doesn't necessarily give different answers, not even with "can you double check". There are quite a few examples where LLMs are consistently and proudly wrong. Don't use LLMs if 100% accuracy matters.