Hacker News
by ksaj 1146 days ago
You can't ask ChatGPT to count something and expect it to answer correctly, because it does not have counting logic. It is a language model, not a math model. People use this to "prove" hallucinations, but when you ask it something that is within its programmed abilities, you get something at least close to what you want.

Having said that, here are the words ChatGPT gave me for the same prompt:

Magi Nagi Sagi Yagi Adagi Galagi Tegagi Sigikagi Tagi Wagagi

It missed Unagi, surprisingly. But it is still leagues ahead of the response primordialsoup got from Lamini.

2 comments

It's true that ChatGPT is not designed for counting and struggles with it in general.

But my point was that ChatGPT, like any tokenized LLM, doesn't even have the concept of letters. The prompt "how many e's in this sentence" is rendered as the tokens [4919, 867, 304, 338, 287, 428, 6827]. There just isn't a pathway for it to consider the letters that make up those tokens.
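To make the mismatch concrete, here is a minimal sketch that puts the character view next to the token view. The token IDs are copied from the comment above and are not recomputed here; no tokenizer library is called.

```python
# The prompt as a person sees it: a string of characters.
prompt = "how many e's in this sentence"

# The prompt as the model receives it: the IDs quoted in the comment above
# (copied from the thread, not recomputed with a real tokenizer).
token_ids = [4919, 867, 304, 338, 287, 428, 6827]

# Counting letters is trivial at the character level...
e_count = prompt.count("e")

# ...but the model only gets this many opaque integers, none of which
# spells out which letters it stands for.
num_tokens = len(token_ids)

print(f"{e_count} e's in the text, delivered as {num_tokens} token IDs")
```

Nothing in the list of integers exposes the letter "e"; any letter-level knowledge would have to be learned indirectly during training.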

I'm a little surprised it did that well on your prompt, which is rendered as [10919, 2456, 886, 287, 556, 72]. The interesting thing here is that 556 = " ag" (with leading space) and 72 = "i". So I'm not sure how it got to those words. "Wagagi" is tokens [54, 363, 18013], so somehow it is seeing that token 18013 is what you get when you combine 556 and 72? That seems really weird.
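A toy sketch of why this looks odd. It assumes a five-entry vocabulary built only from the pieces and IDs quoted above, and it uses greedy longest-match instead of real BPE merge rules, so `TOY_VOCAB` and `toy_encode` are hypothetical stand-ins and not the actual tokenizer.

```python
# Toy vocabulary: only the pieces quoted in the thread, with their quoted IDs.
# A real BPE vocabulary has tens of thousands of entries and applies learned
# merge rules; greedy longest-match is used here purely for illustration.
TOY_VOCAB = {" ag": 556, "i": 72, "W": 54, "ag": 363, "agi": 18013}


def toy_encode(text, vocab=TOY_VOCAB):
    """Encode text by repeatedly taking the longest vocabulary piece that fits."""
    ids = []
    pos = 0
    while pos < len(text):
        # Try the longest remaining substring first, then shrink.
        for end in range(len(text), pos, -1):
            piece = text[pos:end]
            if piece in vocab:
                ids.append(vocab[piece])
                pos = end
                break
        else:
            raise ValueError(f"no toy token covers {text[pos:]!r}")
    return ids


suffix_ids = toy_encode(" agi")    # the end of the prompt
word_ids = toy_encode("Wagagi")    # one of the words in the answer

# The same three letters show up under different IDs in the two contexts.
shared_ids = set(suffix_ids) & set(word_ids)
print(suffix_ids, word_ids, shared_ids)
```

In this toy setup the prompt suffix and the answer word share no token IDs at all, even though both end in the letters "agi", which is the puzzle the comment is pointing at.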

I'd love clarification from someone deeper into LLMs and tokenization.

This is an excellent question. I wonder if it's something like [1] on letter composition rather than meaning.

[1] https://arxiv.org/pdf/1810.04882.pdf

In a prompt, can you just tell the model which letters make up each token? E.g., a list of ag = a g, etc. I imagine a dictionary of that for all tokens in the training data would help.
Maybe? Individual letters are tokens, so you could say something like 3128 = 56 + 129, but the problem is that 3128 is processed as text, not as the integer token ID. So the tokenizer would turn 3128 into a series of tokens.

Intuitively I think there's an abstraction barrier there, but I'm not positive. It feels like asking us to list all of the words that trigger particular neurons.

ChatGPT does have counting logic. The math model is encoded inside of the language model.
This needs citation. These are not the same things. It will get numerical references right if it has sources used in the model, but it isn't doing any numerical calculations.
Just look at any papers that put models through mathematical benchmarks. The model isn't memorizing these problems. For example, I just generated 2 random 64-bit integers and asked ChatGPT to add them.

"6769545085823578960 + 16027170449476717488"

ChatGPT said the answer is 22796715535300296448. It got the correct answer even though the problem wasn't in its training data.
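For anyone who wants to check the arithmetic independently, a short sketch using Python's arbitrary-precision integers. The variable names are mine; the three numbers are copied from the comment.

```python
# Operands and claimed result, copied from the comment above.
a = 6769545085823578960
b = 16027170449476717488
claimed = 22796715535300296448

UINT64_MAX = 2**64 - 1

# Both operands fit in an unsigned 64-bit integer.
both_fit = a <= UINT64_MAX and b <= UINT64_MAX

# Python ints do not overflow, so this is an exact comparison.
sum_matches = (a + b == claimed)

# The exact sum is larger than any unsigned 64-bit value, so a fixed-width
# 64-bit add would have wrapped around.
sum_exceeds_uint64 = (a + b) > UINT64_MAX

print(both_fit, sum_matches, sum_exceeds_uint64)
```

The claimed answer matches the exact sum, and the sum itself no longer fits in 64 bits.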

Yep, as always, people (and LLMs) take stuff for granted because they read it somewhere months ago. That’s why we are doomed; everyone believes anything without question if it’s not against their personal agenda.
This needs citation :) It does numerical calculations, at least in GPT-4 mode; I tested it. It can do simple arithmetic, and even has a sort of 'imagination', or the impression of one. I asked it to imagine a room with 4 colored balls at the corners. Then I asked about the view angles between some pairs of balls as if looking from the center of the room, and from other balls. It gave the answers with explanations.

This doesn't mean it's always correct, or can be trusted without verification.
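One way to verify answers like these is to compute the angles directly. This is only a sketch under assumptions the comment does not state: a square floor plan, balls treated as points in 2D, and hypothetical labels A to D for the corners.

```python
import math

# Assumed layout: unit-square room, one ball per corner (labels are hypothetical).
A, B, C, D = (0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)
CENTER = (0.5, 0.5)


def view_angle_deg(viewer, p, q):
    """Angle in degrees at `viewer` between the directions toward p and q."""
    a1 = math.atan2(p[1] - viewer[1], p[0] - viewer[0])
    a2 = math.atan2(q[1] - viewer[1], q[0] - viewer[0])
    diff = abs(a1 - a2) % (2 * math.pi)
    # Take the smaller of the two angles around the viewer.
    return math.degrees(min(diff, 2 * math.pi - diff))


adjacent_from_center = view_angle_deg(CENTER, A, B)   # neighbouring corners
opposite_from_center = view_angle_deg(CENTER, A, C)   # diagonal corners
neighbours_from_ball = view_angle_deg(A, B, D)        # both neighbours, seen from A
diagonal_from_ball = view_angle_deg(A, B, C)          # neighbour vs. diagonal, seen from A

print(adjacent_from_center, opposite_from_center,
      neighbours_from_ball, diagonal_from_ball)
```

Under those assumptions the expected values are 90 and 180 degrees from the center, and 90 and 45 degrees from a corner ball, which gives a baseline to compare the model's explanations against.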

I can feed ChatGPT code that does calculations (and have) and have it calculate the right answers. It also gets it wrong a lot, so it's not good at that, but any notion that it can't do numerical calculations is easy to disprove.