Hacker News
by hgsgm 1207 days ago
ChatGPT doesn't understand relationships between numbers well. There are far too many of them, compared to words, since every slight perturbation of a number is a different valid number. (Also, I'm not sure if it even treated individual digits as separate tokens, but it might. Someone with API access can check.)

To give it a fair shot, you need to describe the problem using logical conceptual vocabulary, not numbers.

3 comments

> Also, I'm not sure if it even treated individual digits as separate tokens, but it might. Someone with API access can check.

Anyone can check, they have a tool for that[1]. It's mentioned in their FAQ article[2].

According to their tool, GPT-3 counts the following as one token:

- any combination of 3 or fewer digits

- 1111, 3333, 6666, 9999 (it tends to group other digits in groups of 2)

- 66666666 (so 8 sixes -- 5, 6 or 7 won't work)

- 00000000 (anything below 8 zeros counts as one token as well, probably to handle millions and billions)

- 0000000000000000 (16 zeros)

This isn't an exhaustive list; there are probably a lot of other weird edge cases I haven't tried. Its failure to understand basic arithmetic makes much more sense given how inconsistently digits are tokenized.

[1]: https://platform.openai.com/tokenizer

[2]: https://help.openai.com/en/articles/4936856-what-are-tokens-...

It doesn't understand simple logic puzzles (if X then Y, if Y then Z, X is true, is Z true?), either.

Or, rather, it prints out a paragraph of the correct way to solve such a puzzle, and then produces a totally contradictory answer.

When asked:

> Sally will go to the dance if James goes to the dance. Michael will not go to the dance if Julie will go to the dance. James is sick and can’t go to the dance. Julie and James always go to dances together. Will Michael go to the dance?

Its response is:

> Michael will not go to the dance, since his attendance is contingent on Julie not going and James being unable to attend due to illness does not change that condition. Given that James and Julie always attend dances together, it is likely that Julie will also not go to the dance, meaning that Michael’s condition for attending will be met, and he will not go to the dance.

For a beautiful combination of logic and numbers, and the spectacular way in which it explodes, ask it a number-logic puzzle.

> Could you please solve the following number puzzle for each digit?

     AAA
  +  BBB
  --------
  = AAAC

Its response reveals an 'understanding' of what is being asked of it and of the constraints involved, and it takes a reasonable approach to the problem, but the logical errors it commits turn the result into utter nonsense.

Nonsense like trying '15' for the value of A, because it thinks that A must be an odd multiple of '5', due to algebraic gems, such as:

  2A + 2B = A + 10C - 11
  Simplifying, we get:
  A = 5C - 6 - B/2

Its idea of math - even math it is using to solve problems - simply consists of building a soup of numbers and letters.

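For comparison, the puzzle reduces to a single equation once each repeated-digit number is written out by place value. This is a sketch, assuming the usual convention that the leading digit A is not zero:

```latex
\begin{align*}
111A + 111B &= 1110A + C \\
111B &= 999A + C \\
A \ge 1,\ 111B \le 999 &\implies A = 1,\ C = 0 \\
111B = 999 &\implies B = 9
\end{align*}
```

Under that assumption the only assignment is A = 1, B = 9, C = 0, i.e. 111 + 999 = 1110.
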
Arithmetic fail kinda makes sense when you look at how the numbers get tokenized. Try this:

https://platform.openai.com/tokenizer

Then imagine how well you'd be able to do even basic math if your representation of numbers was such that 2045 is made up of tokens (20,45) while 2145 is (2,145) and 2005 is just (2005). No wonder that whatever relationships it derived from the training corpus don't generalize well.
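As a rough illustration of why splits like these hurt, here is a toy sketch in Python. The vocabulary `TOY_VOCAB` and the function `toy_tokenize` are hypothetical and were chosen only to reproduce the three splits quoted above. The real tokenizer uses byte-pair encoding with a much larger vocabulary, so treat this as a simplification, not as how GPT-3 actually works.

```python
# Hypothetical vocabulary, picked only to reproduce the splits quoted above.
# It is NOT the real GPT-3 vocabulary, and greedy longest-match is a
# simplification of byte-pair encoding.
TOY_VOCAB = {"2005", "20", "45", "145"} | set("0123456789")


def toy_tokenize(text, vocab=TOY_VOCAB):
    """Split text by repeatedly taking the longest vocabulary entry."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink it.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            raise ValueError(f"no token matches at position {i}")
    return tokens


for number in ("2045", "2145", "2005"):
    print(number, toy_tokenize(number))
```

Three numbers that differ by one digit come out with three different shapes, so nothing about place value is visible in the token sequence itself.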

Ask it to work through the problem first and write down intermediate steps, and only write the answer at the end. You should get better results than "wrong answer, then trying to justify it".

It does work through the problem, both with the logic, and with the number puzzle, providing all the intermediate steps necessary to solve it.

The problem here is that all the intermediate steps have serious mistakes in them. It's like asking a Markov chain to do algebra. There are numbers, and letters, and equals signs, and it's all just word soup.

https://pastebin.com/Yy35m6um

You can try the dance problem with this prompt prefix:

To answer the following problem, work through it by reasoning step by step and writing that reasoning down, making sure steps are not conflicting with previous steps. Only after you've written down all the steps, write down the final answer and base it on the previous steps.

I tried the dance problem, and regenerated the response three times.

The first two claimed that Michael will go to the dance, but the third one made the correct argument that Michael may or may not go to the dance.
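The "may or may not" answer can be checked by enumerating every combination of who attends. This is a minimal sketch that assumes each "if" in the puzzle is a one-way implication (not "if and only if"); the function names are made up for illustration.

```python
from itertools import product


def consistent(sally, james, michael, julie):
    """True when an attendance assignment satisfies every stated rule."""
    return (
        ((not james) or sally)              # Sally goes if James goes
        and ((not julie) or (not michael))  # Michael stays home if Julie goes
        and (not james)                     # James is sick and can't go
        and (julie == james)                # Julie and James go together
    )


def michael_outcomes():
    """Collect every value Michael's attendance takes in a consistent world."""
    return {
        michael
        for sally, james, michael, julie in product([False, True], repeat=4)
        if consistent(sally, james, michael, julie)
    }


print(sorted(michael_outcomes()))  # [False, True]
```

Both values survive, so under this reading the premises do not determine whether Michael goes.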

It didn't help it do any better on the number problem. Even with that paragraph prepended, it still gets the first step wrong, as well as everything that follows it.

> To solve this puzzle, we need to find the values of A, B, and C that satisfy the equation:

> AAA + BBB = AAAC

> Let's start by looking at the rightmost digit, which is C. We know that C must be either 0 or 1 because the sum of two digits cannot be greater than 18 (9 + 9 = 18). Also, C cannot be 0 because that would mean that A and B would be equal, which is not allowed in this puzzle. Therefore, C must be 1.

... And then it keeps going into la-la land.

The final answer it gives is, by the way:

957 + 483 = 1440
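That sum is arithmetically true, but it does not have the shape the puzzle asks for. The search space is tiny, so a brute-force check is easy to sketch. The helper names below are hypothetical, and the code assumes the leading digit A is nonzero.

```python
def solve_puzzle():
    """Try every digit assignment for AAA + BBB = AAAC."""
    solutions = []
    for a in range(1, 10):      # A leads both AAA and AAAC, so it is not 0
        for b in range(10):
            for c in range(10):
                if 111 * a + 111 * b == 1110 * a + c:
                    solutions.append((a, b, c))
    return solutions


def fits_pattern(x, y, total):
    """Check that x + y = total has the form AAA + BBB = AAAC."""
    xs, ys, ts = str(x), str(y), str(total)
    return (
        len(xs) == 3 and len(set(xs)) == 1      # x is one digit repeated
        and len(ys) == 3 and len(set(ys)) == 1  # y is one digit repeated
        and len(ts) == 4 and ts[:3] == xs       # total starts with AAA
        and x + y == total
    )


print(solve_puzzle())                 # [(1, 9, 0)]
print(fits_pattern(957, 483, 1440))   # False
print(fits_pattern(111, 999, 1110))   # True
```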

That's how it seems. That said, this seems like a very tractable problem to fix.