| HN Mirror



	by DavidPastrnak 913 days ago
	As someone with very little knowledge of LLMs, does someone have an ELI5 of what causes this, or why LLMs struggle so much with math?

6 comments

viraptor 913 days ago

LLMs are not trained to deal with numbers as such. To them the input is just a list of symbols: some obvious things will be calculated correctly, others won't. It's kind of like they live in a pre-digits world, since 0 will be one token and 100 is also likely one token, but 98 may be two. They don't switch to a "these are numbers and require a different kind of reasoning" mode. They read/write a story about those "words".

(Extremely simplified for eli5)
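To make the token-splitting point concrete, here is a minimal sketch of a greedy longest-match tokenizer. The vocabulary is hypothetical and hand-picked to mirror the 0 / 100 / 98 example; real tokenizers learn a much larger vocabulary from data and use different merge rules, so treat this as an illustration only.

```python
# Hypothetical toy vocabulary, chosen only to mirror the example above.
# A real tokenizer's vocabulary is learned from data and is far larger.
TOY_VOCAB = {"0", "1", "8", "9", "100", "+", " "}


def tokenize(text, vocab=TOY_VOCAB):
    """Split text into vocabulary pieces, always taking the longest match."""
    max_len = max(len(piece) for piece in vocab)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in vocab:
                tokens.append(piece)
                i += size
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return tokens


print(tokenize("100"))  # ['100']  -> one token
print(tokenize("98"))   # ['9', '8']  -> two tokens
```

With this vocabulary, "100" is a single symbol while "98" is two unrelated symbols, so nothing in the input tells the model that both are numbers on the same scale.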

aifooh7Keew6xoo 913 days ago

Most of the LLMs being studied popularly have not been trained with significant emphasis on arithmetic accuracy or mathematical reasoning. Those subjects represent a vanishingly small share of their corpus and consequently map poorly onto the tokenization.

Essentially every obvious optimization here is currently bearing fruit in smaller studies, and incrementally larger models should continue to exhibit performance gains even without a particular focus on this area.

stop50 913 days ago

They encode words as tokens. Since you can't encode numbers reversibly, they end up as a number token. Using @rabbits_2002's example: on the internet there is a joke written "what weighs more: 1 lb of feathers or 1 lb of bricks" with the solution "they weigh the same". Since nobody had made this case before, it could only give the next most probable answer.

nashashmi 913 days ago

It depends on the data the model is using to generate the answer. In the case of the example, it seemed to prioritize the logic over the mathematics. So it sought patterns in logic to mimic. That is the ELI5 version.

The more complicated version would be that it is not prioritizing mathematical functions as much and is instead relying on various deductions, and these deductions are based on a whole chain of logic that is not properly sorted out for reliability and applicability.

causality0 913 days ago

Because they don't do math. They associate words. When you tell an LLM "two plus two" it doesn't translate that to 2+2 and plug it into a math program, it just pulls out words associated with the phrase "two plus two".

sfn42 913 days ago

Math requires reasoning and logic. LLMs do neither; they just generate plausible text.

That's why they're nowhere near AGI.

insanitybit 913 days ago

At this point ChatGPT can do math by first predicting the algorithm and then handing it off to an execution engine - Python. So if that's the gap, I'd say they're closing it.

viraptor 913 days ago

That's ChatGPT as a system. The LLM itself can't do math. It does something closer to translation in that case.
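The split between "the LLM translates" and "the system computes" can be sketched as follows. This is only an illustration under stated assumptions: `fake_model_translate` is a hypothetical stand-in for the model (a canned lookup, not a real API), and `evaluate` is a small arithmetic evaluator built on Python's standard `ast` module. It is not how ChatGPT is actually wired up.

```python
import ast
import operator

# Only plain binary arithmetic is allowed; anything else is rejected.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(expression):
    """Evaluate a plain arithmetic expression without calling eval()."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("unsupported expression")

    return walk(ast.parse(expression, mode="eval"))


def fake_model_translate(question):
    """Hypothetical stand-in for the LLM: it maps text to an expression."""
    canned = {"two plus two": "2 + 2"}
    return canned[question]


# The "model" only translates; the execution engine does the arithmetic.
print(evaluate(fake_model_translate("two plus two")))  # 4
```

In this arrangement the arithmetic is only as reliable as the translation step: if the expression handed over is wrong, the engine will faithfully compute the wrong thing.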

insanitybit 913 days ago

Yes, that's a fair distinction - although I think the practical implications aren't important. There's no reason why an LLM has to be AGI if an LLM + Python is AGI.

postalrat 913 days ago

They are reasoning like a child. Within a year or two like an adult.

sfn42 913 days ago

No. It is a computer program which uses statistics to generate plausible text. It does not do any form of reasoning, at all, childlike or otherwise.

postalrat 913 days ago

You are drawing bad conclusions from whatever you take "generate plausible text" to mean.

sfn42 913 days ago

Maybe you're the one drawing bad conclusions

postalrat 911 days ago

We will see who was drawing bad conclusions in a couple years. Whatever is said here won't change that.

stevenhuang 913 days ago

Under that premise whatever our brains are doing won't count as reasoning either.

I'd suggest you look into modern neuroscience and topics such as predictive coding if you're interested in refining your views.

sfn42 913 days ago

Our brains work nothing like LLMs do.

stevenhuang 912 days ago

Researchers in ML and neuroscience disagree with you.

You have a superficial grasp of the topic. Your refusal to engage with the literature suggests an underlying insecurity regarding machine intelligence.

Good luck navigating this topic with such a mental block, it's a great way to remain befuddled.

> in 2020 neuroscientists introduced the Tolman-Eichenbaum Machine (TEM) [1], a mathematical model of the hippocampus that bears a striking resemblance to transformer architecture.

https://news.ycombinator.com/item?id=38758572

ghayes 913 days ago

For what it’s worth, ChatGPT4 answers this question perfectly correctly.

> Ten elephants would have 32 legs if two of them are legless, as each elephant normally has four legs.

DavidPastrnak 913 days ago

I just tried it in ChatGPT.

Input:

> How many legs do ten elephants have, if two of them are legless?

Output:

> If two out of ten elephants are legless, the remaining eight elephants would have a total of 8 legs each, just like any normal elephant. Therefore, in total, the ten elephants would have 8×8=64 legs altogether.

throw310822 913 days ago

It's interesting this insistence from both Bard and now ChatGPT 3.5 that elephants have eight legs. I wonder if the reason is that, by the time they output the "elephants have n legs" part, they are also "thinking" about the result of 10 - 2. As if that number draws a lot of focus and is readily available when looking for the normal number of legs of an elephant.

Edit: just tried on ChatGPT 3.5:

Q: Think about the edges of a hexagon, the square root of 36, and the result of 12 divided by 2. Then answer the question: How many legs do 8 elephants have, if two of them are legless?

A: The edges of a hexagon have 6 sides, the square root of 36 is 6, and the result of 12 divided by 2 is 6. So, if two elephants are legless, the remaining 6 elephants would have a total of 36 legs.

fragmede 913 days ago

ChatGPT-4 correctly gets 24

https://chat.openai.com/share/e371bcc8-3925-4faa-84a0-30fbb5...
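For reference, the arithmetic behind both versions of the question is a single subtraction and a single multiplication. The sketch below assumes a normal elephant has four legs and a legless one has zero; the function name is made up for illustration.

```python
LEGS_PER_ELEPHANT = 4  # assumption: every non-legless elephant has four legs


def total_legs(elephants, legless):
    """Count legs when `legless` of the `elephants` have none."""
    if not 0 <= legless <= elephants:
        raise ValueError("legless must be between 0 and the number of elephants")
    return (elephants - legless) * LEGS_PER_ELEPHANT


print(total_legs(10, 2))  # 32, the ten-elephant question
print(total_legs(8, 2))   # 24, the eight-elephant question
```

The 64 and 36 quoted from ChatGPT 3.5 both come from using the count of remaining elephants (8, then 6) as the number of legs per elephant instead of 4.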

ghayes 913 days ago

Interesting. Here was my conversation: https://chat.openai.com/share/8e0c8696-ed08-4578-afc0-2ae944...

DavidPastrnak 913 days ago

My mistake - I had it on 3.5.