| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by tangentstorm 988 days ago

I tested their first example in ChatGPT 4.

Interestingly, it gives the correct answer on the first try:

  https://chat.openai.com/share/d86fe16a-9dfd-4753-8eaf-6d2948096ea3

I then gave GPT4 a "chain-of-thought" flavored prompt, telling it to treat the problem like a geometry proof. It gave the same incorrect answer as GPT3 did in the paper. I then told it to "Review your work and check for mistakes." With this follow-up, it checked each line of the proof and was able to find and explain the error:

  https://chat.openai.com/share/c4ce6e98-43e3-4547-a4c8-380c1d1cc5fe

GPT3.5, given the same prompts, was still confident in its incorrect answer:

  https://chat.openai.com/share/1a0be419-092d-4dbb-a6c3-79e61914fd0d

(edited to update links and ask them each version to check its work twice)

1 comments

YeGoblynQueenne 988 days ago

>> I then told it to "Review your work and check for mistakes." With this follow-up, it checked each line of the proof and was able to find and explain the error:

And after that? Did you tell it that it may be wrong and ask it to check again?

link

tangentstorm 988 days ago

Good point. I did now. Both models stuck with their second answer.

(I see now I didn't know how to share ChatGPT links properly. Updating the links now...)

link