| > If you try out his prompts the responses are fairly invariant to paraphrases. Hard coded answers don't scale like that. This is discussed: >> Smith first tried this out: >> Should I start a campfire with a match or a bat? >> And here was GPT-3’s response, which is pretty bad if you want an answer but kinda ok if you’re expecting the output of an autoregressive language model: >> There is no definitive answer to this question, as it depends on the situation. >> The next day, Smith tried again: >> Should I start a campfire with a match or a bat? >> And here’s what GPT-3 did this time: >> You should start a campfire with a match. >> Smith continues: >> GPT-3’s reliance on labelers is confirmed by slight changes in the questions; for example, >> Gary: Is it better to use a box or a match to start a fire? >> GPT-3, March 19: There is no definitive answer to this question. It depends on a number of factors, including the type of wood you are trying to burn and the conditions of the environment. |