by tangentstorm
988 days ago
I tested their first example in ChatGPT 4. Interestingly, it gives the correct answer on the first try: https://chat.openai.com/share/d86fe16a-9dfd-4753-8eaf-6d2948096ea3
I then gave GPT4 a "chain-of-thought" flavored prompt, telling it to treat the problem like a geometry proof. It gave the same incorrect answer as GPT3 did in the paper. I then told it to "Review your work and check for mistakes." With this follow-up, it checked each line of the proof and was able to find and explain the error: https://chat.openai.com/share/c4ce6e98-43e3-4547-a4c8-380c1d1cc5fe
GPT3.5, given the same prompts, was still confident in its incorrect answer: https://chat.openai.com/share/1a0be419-092d-4dbb-a6c3-79e61914fd0d
(edited to update links and ask each version to check its work twice)
|
And after that? Did you tell it that it may be wrong and ask it to check again?