by gwern 1205 days ago
If you tweaked the inner-monologue prompts to specify delimiters like pipes, you could presumably parse the monologue out before showing the response to the reader.

Bing Sydney may be doing this, or something like it, judging by the PM's tweet: https://twitter.com/MParakhin/status/1632087709060825088
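To make the delimiter idea concrete, here is a minimal sketch. It assumes a made-up convention in which the prompt tells the model to wrap its private reasoning in pipes; the function names and the sample string are hypothetical, and nothing here is known to match what Bing or ChatGPT actually does.

```python
import re

# Assumed convention (hypothetical): the prompt instructs the model to write
# its inner monologue between pipes, e.g. "|reasoning| visible answer".
MONOLOGUE = re.compile(r"\|([^|]*)\|")


def extract_monologue(raw: str) -> list[str]:
    """Return the pipe-delimited spans, i.e. the hidden reasoning."""
    return [span.strip() for span in MONOLOGUE.findall(raw)]


def strip_monologue(raw: str) -> str:
    """Remove the pipe-delimited spans and collapse leftover whitespace."""
    visible = MONOLOGUE.sub("", raw)
    return re.sub(r"\s+", " ", visible).strip()


if __name__ == "__main__":
    raw = "|The user asks 2+2; that is 4.| The answer is 4."
    print(extract_monologue(raw))  # the part kept from the reader
    print(strip_monologue(raw))    # the part shown to the reader
```

An unclosed pipe is left in the output untouched, so a real filter would need to decide what to do when the model forgets the closing delimiter.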

---

One approach here would be prompt injection: just insert the 'No' into your own response so ChatGPT tries completing that. Also:

> I speculate that the temperature, when coupled with the mechanism of generating text based on already-generated text, could explain some cases of ChatGPT stupidity. In cases when ChatGPT should be perfectly accurate, the temperature will surely under-optimize its cleverness, and now the entire conversation is broken, because everything else will depend on what foolishness it just wrote.

Absolutely. This is why 'best-of' sampling (not available in ChatGPT's default interface) can be so useful. You decode many different possibilities in parallel, the ones where the random decoding makes a fatal error get discarded, and you get back the most plausible one overall, which is much more likely to be correct.
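A rough sketch of best-of selection, under stated assumptions: `sample_fn` is a hypothetical stand-in for whatever draws one random completion and returns its per-token log-probabilities, and "most plausible" is taken to mean the highest mean log-probability per token. The candidates are drawn one after another here for simplicity; a real decoder would batch them.

```python
from typing import Callable, List, Tuple

# One candidate: (completion text, per-token log-probabilities).
Sample = Tuple[str, List[float]]


def mean_logprob(sample: Sample) -> float:
    """Score a candidate by its average log-probability per token."""
    _, logprobs = sample
    return sum(logprobs) / max(len(logprobs), 1)


def best_of(sample_fn: Callable[[], Sample], n: int) -> str:
    """Draw n candidates and return the text of the highest-scoring one.

    sample_fn is a hypothetical callback standing in for one stochastic
    decode (temperature > 0) of the same prompt.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates = [sample_fn() for _ in range(n)]
    return max(candidates, key=mean_logprob)[0]


if __name__ == "__main__":
    # Hand-written candidates in place of a real model: the two wrong
    # answers each contain one very unlikely token.
    fake = iter([
        ("2+2=5", [-0.1, -3.0]),
        ("2+2=4", [-0.1, -0.2]),
        ("2+2=22", [-0.1, -4.0, -1.0]),
    ])
    print(best_of(lambda: next(fake), 3))
```

Averaging per token rather than summing avoids favoring short completions, but either scoring rule is a design choice and not something the comment above specifies.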