by godelski
784 days ago
Take out this:
Instructions:
1. Do not include any assumptions that I have not mentioned here.
2. Before solving the problem, state the goal of the problem.
3. After each step of your reasoning, state where the man and the goat are now standing, and state if the goal has been achieved or not, if the goal has been achieved then stop. If the goal has not been achieved then explain why not.
Then tell me what happens.
Spoilage is incredibly easy to do. It is about information leakage, and you have to think very carefully about the subtle ways information can leak through. Specifically, #1 and #2 are strong hints that there is a trick to the problem (i.e., is this something you would use in a generic prompt?). #3 is a reiteration of the problem, which gives it extra weight. You can decrease the weight by restating it as "state where the man and any animals are located" (notice there's lower information gain here). "if the goal has been achieved then stop" is a big hint: a model that can reason should know when to stop. I posted some recent river crossing tweets in this comment that may be of interest to you: https://news.ycombinator.com/item?id=40231409
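If you want to try the "take it out and see what happens" test systematically, something like the rough sketch below would generate every variant of the prompt with a different subset of the three instructions kept. The names `PROBLEM`, `build_prompt`, and `ablation_prompts` are made up for illustration, and `PROBLEM` is only a placeholder since the problem text isn't quoted here. It builds prompt strings only; sending them to a model and comparing the answers is left to you.

```python
from itertools import combinations

# Placeholder: substitute the actual problem text (not quoted in this thread).
PROBLEM = "<problem statement goes here>"

# The three instructions quoted above, verbatim.
INSTRUCTIONS = [
    "Do not include any assumptions that I have not mentioned here.",
    "Before solving the problem, state the goal of the problem.",
    "After each step of your reasoning, state where the man and the goat "
    "are now standing, and state if the goal has been achieved or not, "
    "if the goal has been achieved then stop. If the goal has not been "
    "achieved then explain why not.",
]


def build_prompt(problem, instructions):
    """Return the problem followed by a renumbered instruction list."""
    if not instructions:
        return problem  # bare problem, no hints at all
    numbered = [f"{i}. {text}" for i, text in enumerate(instructions, start=1)]
    return "\n".join([problem, "Instructions:", *numbered])


def ablation_prompts(problem, instructions):
    """Map each kept-index tuple to the prompt built from that subset."""
    variants = {}
    for size in range(len(instructions) + 1):
        for kept in combinations(range(len(instructions)), size):
            chosen = [instructions[i] for i in kept]
            variants[kept] = build_prompt(problem, chosen)
    return variants


variants = ablation_prompts(PROBLEM, INSTRUCTIONS)
```

The key `()` is the bare problem and `(0, 1, 2)` is the full original prompt, so comparing answers across keys gives a rough idea of which instruction is doing the leaking.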
The instructions prompt it to proceed rigorously, since this is a logic problem, not a natural language problem. These models are primarily trained to solve natural language processing tasks, so training and tuning predispose them to answer in a certain way. By default the models produce less verbose output, which reduces cost (each token costs money). Telling the model to generate more tokens of step-by-step reasoning lets it "think" further, as it can only "think" while generating tokens.
OpenAI could train or tune ChatGPT to "spoil" itself by default when answering any problem that it identifies as a logic problem. It is somewhat arbitrary.