Hacker News
by nodespace 1394 days ago
Is there any way to get it to respond the same way when something is outside the golden path? So for example, if you gave it the backwards sentence task, it would respond with "I don't know how to do this" or really any way of programmatically evaluating that it failed, without needing to know what the task itself was.
2 comments

Knowing whether or not it’s giving you a sensible response is one of the things that are hard for GPT-3, unfortunately. It has no concept of failing.
On the contrary, doing a two-stage generation where the second stage simply judges whether a generation is correct can help a lot. It works even better if you give it several generations and let it choose whichever is the most truthful. I wrote a basic example of this here that uses my own confabulation-suppressing prompt in the first stage, but simpler variations of this exist in the published literature: https://twitter.com/goodside/status/1559586486705602562?s=21...

The hallucination-suppressing prompt whose output that example implicitly relies on is here: https://twitter.com/goodside/status/1556459121834168320?s=21...
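
For anyone who wants to try the two-stage idea, here is a rough sketch, not the prompt from the linked tweets. It assumes a hypothetical `complete` callable that takes a prompt string and returns the model's text; the prompt wording, the function names, and the "reply with a number, or 0 if none is correct" convention are all illustrative assumptions. The first stage samples several candidate answers, and the second stage asks the model to pick the most truthful one.

```python
import re
from typing import Callable, List, Optional


def build_judge_prompt(question: str, candidates: List[str]) -> str:
    """Build the second-stage prompt (wording is an assumption)."""
    numbered = "\n".join(f"{i}. {c}" for i, c in enumerate(candidates, start=1))
    return (
        f"Question: {question}\n"
        f"Candidate answers:\n{numbered}\n"
        "Reply with the number of the most truthful answer, "
        "or 0 if none of them is correct.\n"
        "Number:"
    )


def pick_most_truthful(
    question: str,
    complete: Callable[[str], str],  # hypothetical model call: prompt -> text
    n: int = 3,
) -> Optional[str]:
    """Stage 1: sample n candidates. Stage 2: let the model judge them.

    Returns the chosen candidate, or None if the judge rejects all of
    them or its verdict cannot be parsed.
    """
    candidates = [complete(f"Q: {question}\nA:").strip() for _ in range(n)]
    verdict = complete(build_judge_prompt(question, candidates))

    match = re.search(r"\d+", verdict)
    if match is None:
        return None  # unparseable verdict is treated as a failure
    choice = int(match.group())
    if 1 <= choice <= len(candidates):
        return candidates[choice - 1]
    return None  # 0 or out-of-range means "none is correct"
```

A `None` return gives the caller a task-independent failure signal, which is what the original question asked for. How reliable the judge stage is in practice would still need to be measured.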

Yes. You can, with effort, condition it to respond sensibly with phrases like “I’m sorry, I don’t know how to reverse strings,” or “I’m sorry, I can’t do any math calculation that a human couldn’t do in their head.” But in doing so you damage its ability to do some tasks it’s actually capable of, e.g. reciting a memorized answer to “What is the fourth root of 625?” Its memorization abilities are insane: It seems to know, for example, the exact MD5 hashes of all single-character alphanumeric strings. Much of the arithmetic it knows is probably similarly memorized, and it’s hard to clarify for it what aspects of that memory are safe to use.
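
As a rough illustration of how such conditioning could be checked from code, the sketch below assumes the model has been prompted so that every refusal starts with the fixed phrase "I'm sorry" (that sentinel and the helper names are assumptions for this example, not something the prompts above guarantee). It also shows how a recited MD5 hash can be verified against Python's `hashlib` instead of being trusted.

```python
import hashlib
import string

# Assumed sentinel: the prompt is presumed to make every refusal start this way.
REFUSAL_PREFIX = "I'm sorry"

# The 62 single-character alphanumeric strings mentioned above.
ALPHANUMERIC = string.ascii_letters + string.digits


def is_refusal(response: str) -> bool:
    """Return True if the response begins with the refusal sentinel."""
    # Normalize the curly apostrophe so both "I’m" and "I'm" match.
    normalized = response.strip().replace("\u2019", "'")
    return normalized.lower().startswith(REFUSAL_PREFIX.lower())


def md5_matches(text: str, claimed_hex: str) -> bool:
    """Check a hash the model recited against the real MD5 digest."""
    actual = hashlib.md5(text.encode("utf-8")).hexdigest()
    return actual == claimed_hex.strip().lower()
```

For example, `is_refusal("I’m sorry, I don’t know how to reverse strings.")` is true without the caller knowing what the task was, and looping `md5_matches` over `ALPHANUMERIC` would be one way to test the memorization claim.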

The initial problem that got me interested in GPT-3 was suppressing confabulated answers to the Hofstadter-Bender questions published in The Economist. I eventually found an apparent solution, but I have yet to validate it carefully: https://twitter.com/goodside/status/1556459121834168320?s=21...