| Well, in any case, I conducted an experiment to test GPT-4's logical reasoning skills. First, I asked GPT-4 to create a more difficult version of the classic "wolf, goat and cabbage" puzzle, specifying that it must keep the core logical rules the same and only increase the complexity. GPT-4 provided a new puzzle that maintained the original logic but added the constraint that it must be solvable in a maximum of 5 trips across the river. In a separate, independent chat, I gave this new puzzle to GPT-4 and asked it for a step-by-step solution, which it produced. Here is the key part: I copied GPT-4's solution from the second chat and pasted it into the first chat, the one with the GPT-4 instance that had created the harder puzzle, and asked that instance to grade whether the solution met all the logical criteria it had set forth. Remarkably, the first instance was able to analyze the logic of an answer it did not generate itself. It confirmed that the solution made good strategic decisions and met the constraints it had defined, including solving the puzzle in a maximum of 5 trips. This demonstrates that GPT-4 possesses capacities for strategic reasoning, for evaluating logical consistency across two separate conversations, and for checking solutions against rules it previously set. https://chat.openai.com/share/996583dd-962b-42a8-b4b9-e29c59... |
Since GPTs are not deterministic, any intelligence we attribute to them depends on the observer doing the attributing.
My sense is that confirmation bias and cherry-picking are playing a role in the general consensus that GPTs are intelligent.
For example, people show off beautiful images created by image generators like DALL-E while quietly discarding the ones that were terrible or completely missed the mark.
In other words, GPT as a whole is a fuzzy data generator whose intelligence is imputed by the people reading its output.
My suspicion is that GPT is going to be upper-bounded by the average intelligence of humanity as a whole.