by godelski 784 days ago
If you spoil it with your follow-up questions... which doesn't help, because the point of these is that they're controlled experiments where you do know what the right answer and logic are. You can't test when you don't.

It's not spoiling anything. It's just an observation of the limits of current LLMs.

I tried a few chain-of-thought prompts for the original question, and GPT-3.5 was sometimes (randomly) able to find the correct answer on the first attempt with this one:

https://chat.openai.com/share/c144ba23-2f78-4cc8-a1c5-ca3106...

Take out this

  Instructions: 
  1. Do not include any assumptions that I have not mentioned here. 
  2. Before solving the problem, state the goal of the problem. 
  3. After each step of your reasoning, state where the man and the goat are now standing, and state if the goal has been achieved or not, if the goal has been achieved then stop. If the goal has not been achieved then explain why not.
Then tell me what happens.

Spoilage is incredibly easy to do. It is about information leakage, and you have to think very carefully about how information can leak through in subtle ways. Specifically, #1 and #2 are strong hints that there is a trick to the problem (i.e., is this something you would use in a generic prompt?). #3 is a reiteration of the problem, which gives it extra weight. You can decrease the weight by restating it as "state where the man and any animals are located" (notice there's lower information gain here). "if the goal has been achieved then stop" is a big hint. To reason, it should know when to stop.
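If you want to check which instruction is doing the leaking, one way is to build every combination of the three instructions and run each variant separately. Here's a rough sketch of that, not anything either of us actually ran. `PROBLEM` is a placeholder (the original question isn't reproduced in this thread), and `build_prompt` and `ablation_variants` are names I made up for illustration.

```python
from itertools import combinations

# Placeholder: the original question is not reproduced in this thread.
PROBLEM = "<original river-crossing question goes here>"

# The three instructions quoted above, verbatim.
INSTRUCTIONS = [
    "Do not include any assumptions that I have not mentioned here.",
    "Before solving the problem, state the goal of the problem.",
    "After each step of your reasoning, state where the man and the goat "
    "are now standing, and state if the goal has been achieved or not, "
    "if the goal has been achieved then stop. If the goal has not been "
    "achieved then explain why not.",
]


def build_prompt(problem, instructions):
    """Return the problem alone, or followed by a renumbered instruction list."""
    if not instructions:
        return problem
    numbered = "\n".join(f"{i}. {text}" for i, text in enumerate(instructions, 1))
    return f"{problem}\n\nInstructions:\n{numbered}"


def ablation_variants(problem, instructions):
    """Yield (kept_indices, prompt) for every subset of the instructions."""
    n = len(instructions)
    for size in range(n + 1):
        for kept in combinations(range(n), size):
            yield kept, build_prompt(problem, [instructions[i] for i in kept])


# 2^3 = 8 variants, from the bare problem up to all three instructions.
variants = list(ablation_variants(PROBLEM, INSTRUCTIONS))
```

Since the answers are random from run to run, each variant would need to be sent to the model many times and the success rates compared, not just one transcript per variant.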

I posted some recent river-crossing tweets in this comment that may be of interest to you: https://news.ycombinator.com/item?id=40231409

Yes, we know that current LLMs cannot solve the original prompt. That's why I experimented with different prompts.

The instructions are prompting it to proceed rigorously, as it is a logical problem, not a natural language problem. These models are primarily trained for solving natural language processing tasks, and so they are predisposed to answer in a certain way through training and tuning. The models produce less verbose output by default to reduce cost (each token costs money). Telling the model to generate more tokens in step-by-step reasoning enables it to "think" further as it can only "think" when generating each token.

OpenAI could train or tune ChatGPT to "spoil" itself by default when answering any problem that it identifies as a logic problem. It is somewhat arbitrary.

> The instructions are prompting it to proceed rigorously, as it is a logical problem, not a natural language problem.

I think you're missing a bit here. Look at the middle tweet, where the person constructed it to fail the logic. There are no tricks. What you're missing is the signal you're giving it, how it is spoiling the question in a subtle way. That's very different from a reasoning machine. We can't trust it to reason if it can only "reason" when we give it explicit instructions to do so, instructions that do not generalize to many tasks. That's not really reasoning...

> OpenAI could train or tune ChatGPT to "spoil" itself by default

They have, and it's provable.

> That's not really reasoning

They are trained a certain way to perform specific types of tasks, primarily natural language processing tasks. They have necessarily learned some methods of reasoning in order to do what they were trained to do. No one is pretending that these are symbolic-logic mechanical theorem provers. They are tuned a certain way to respond in a specific manner, and they only do what they are told. If you want it to use reasoning, then you need to tell it to use reasoning. It's a chatbot running on a neural network, and it is not self-aware.

Hopefully the next generation of AI will be more reasonable. We are not there yet.