Hacker News
by throwaway4aday 898 days ago
Unfortunately, the paper only provides the text of one of the prompts they used (Juliet), and it happens to be one of the worst-performing ones, scoring lower than ELIZA. I suppose you could qualify the quote by saying that the best prompt used with GPT-4 had a 41 percent success rate. I don't think that's any more of an omission than leaving out which GPT model was used, leaving out the prompt used, and ignoring the fact that other GPT models beat ELIZA.