| > Also how do you prove that GPT is worse at counting? Back in June 2023 GPT-4 was dramatically worse at counting than a pigeon, in the sense that it couldn't accurately tell the difference between sentences with 3 words and sentences with 5 words, whereas pigeons can count almost anything up to about 10. It also routinely failed "pick the shorter sentence" tests, which I literally took from a test administered to mice. GPT simply doesn't understand what numbers are, whereas pigeons and mice have an intuitive understanding similar to that of toddlers. You don't need to teach kids what 3 means; you just need to teach them the human symbol for the concept of 3. GPT only has the human symbol and does not seem capable of understanding the concept. In my testing GPT-4 consistently failed counting / pattern-recognition tests even with "chain-of-thought" prompting. As far as I could tell, its only true understanding of numbers was "one, two, many." This seems reflected in real use cases, where GPT routinely (and hilariously) ignores commands to return 50 words, etc., of output. GPT doesn't know what fifty means; it just knows what various documents that say "word count: 50" look like, and tries to imitate the tone. Since transformer neural networks lack recursion, I conjecture that GPT will never be able to understand a number larger than 2, even if in specific cases it can solve counting problems up to eleventy billion. This is what I mean by "counting apples, not oranges": its sense of counting is paper-thin and easily fooled by adversarial prompts. It is much harder to fool a mouse or a pigeon. Many of the tests I ran back in April 2023 no longer work. I strongly suspect this is because OpenAI trained GPT on many of the tests that people were throwing at it, and not because GPT actually became "smarter." I stopped messing around with GPT specifically because OpenAI doesn't issue any release notes, making replicability impossible. 
Mistral's 77B model was dramatically worse than even GPT-3 at counting, but I doubt they trained it to count. Not sure about LLaMA, etc. |
E.g., are pigeons actually "counting" in the sense of the process humans use to calculate accurately? Or are they just responding to the signal? It's similar to how a person can tell whether one sound is higher or lower in pitch than another, but wouldn't be able to state the exact frequency numerically.
To me, pigeons are similarly just responding to the amount of "signal" they are receiving, not doing abstract reasoning.
And looking at the studies, it also seems that the pigeons had to be trained to count; they weren't able to do it out of the box.