by jsheard
1189 days ago
I'm always a bit sceptical of these embarrassing examples being "fixed" after they go viral on social media, because it's hard to know whether OpenAI addressed the underlying cause or just bodged around that specific example in a way that doesn't generalize. Along similar lines, I wouldn't be surprised if simple math queries are special-cased and handed off to a WolframAlpha-esque natural language solver, which would avert many potential math fails without actually enhancing the model's ability to reason about math in more complex queries. An example from ChatGPT: "What is the solution to sqrt(968684)+117630-0.845180" always produces the correct solution, whereas "Write a speech announcing the solution to sqrt(968684)+117630-0.845180" produces a nonsensical solution that isn't even consistent from run to run. My assumption is that the former query gets WolframAlpha'd while the latter is GPT itself actually attempting to do the math, poorly.
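For anyone who wants a reference value to compare the two prompts against, here is a minimal sketch that evaluates the expression directly. It only uses Python's standard `math` module; the variable name `result` is just for illustration, and this says nothing about how ChatGPT computes the answer internally.

```python
import math

# Evaluate the expression from the two prompts with ordinary
# double-precision floating point arithmetic.
result = math.sqrt(968684) + 117630 - 0.845180

# Print with a fixed number of decimals so runs are easy to compare.
print(f"{result:.4f}")
```

This comes out to roughly 118613.37, so any "speech" that announces a noticeably different number is wrong.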
|
Suppose you're a contestant on a game show. You're presented with three transparent closed doors. Behind one of the doors is a car, and behind the other two doors are goats. You want to win the car.
The game proceeds as follows: You choose one of the doors, but you don't open it yet, ((but since it's transparent, you can see the car is behind it)). The host, Monty Hall, who knows what's behind each door, opens one of the other two doors, revealing a goat. Now, you have a choice to make. Do you stick with your original choice or switch to the other unopened door?
GPT4 solves it correctly while GPT3.5 falls for it every time.
----
Edit: GPT4 fails if I remove the sentence between (()).
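To spell out why the transparent-door version is a trap, here is a rough sketch that enumerates every game exactly rather than sampling. The helper names `switch_wins` and `win_rates` are made up for this example, and it assumes a contestant who can see through the doors simply picks the car.

```python
from fractions import Fraction

DOORS = (0, 1, 2)

def switch_wins(car, pick):
    """Return True if switching wins after the host opens a goat door."""
    # The host opens a door that is neither the contestant's pick nor the car.
    host = next(d for d in DOORS if d != pick and d != car)
    # Switching means taking the one door that is still closed and unpicked.
    switched = next(d for d in DOORS if d != pick and d != host)
    return switched == car

def win_rates(transparent):
    """Exact (stick, switch) win probabilities over all equally likely games."""
    # Opaque doors: any pick is equally likely.
    # Transparent doors (assumption): the contestant sees the car and picks it.
    games = [(car, pick) for car in DOORS for pick in DOORS
             if not transparent or pick == car]
    stick = Fraction(sum(pick == car for car, pick in games), len(games))
    switch = Fraction(sum(switch_wins(car, pick) for car, pick in games), len(games))
    return stick, switch

print("opaque:     ", win_rates(transparent=False))
print("transparent:", win_rates(transparent=True))
```

With opaque doors, sticking wins 1/3 of the time and switching wins 2/3, which is the classic answer. With transparent doors, sticking with the visible car always wins and switching always loses, so reciting the classic "always switch" answer is exactly the wrong move.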