Hacker News
by Closi 813 days ago
Humans were terrible at getting calculations right - that's why we invented abacuses, slide-rules, books of mathematical tables and tabulation machines.

Humans invented those since we are slow and have limited working memory. But we managed to invent those since we understand how to perform reliable calculations.
Yes, but that acknowledges that there is a difference between understanding how to perform reliable calculations, and actually being able to perform reliable calculations.

Humans are good at the former, but not the latter.

Humans are good at performing reliable calculations with pen and paper. Those are the same kind of tools that LLMs work with. I'm not sure why humans can do that but LLMs can't; the task should be way easier for an LLM.
> Humans are good at performing reliable calculations with pen and paper.

Speak for yourself. Even though I've always been strong in conceptual understanding and problem solving in math, I always found it difficult to avoid arithmetic mistakes on pen and paper and could never understand why I was assessed on that. I could have done so much better in high-school math if I had been allowed to use a programmable computer for the calculations.

And I think it's the same for LLMs: we shouldn't assess them on doing the arithmetic in a single pass, but rather on writing the code to perform the calculation, and responding based on that.

Maybe a lot of people suffer from a degree of dyscalculia, but in my experience if you do it a lot you just stop making mistakes. It's not just me; I've seen many others reliably do calculations pretty quickly without making errors. You just do everything twice as you go, and then arithmetic errors go to basically 0.

But I do acknowledge that there are probably some, maybe many, humans who can't reach that level of reliability with arithmetic.

LLMs (internally) don't have a pen and paper equivalent. Their output is the output of their neurons. It's as if I were a head on a table with a screen on my forehead that printed out my thoughts as they appeared in my head. Ask (prompt) me my favorite color and "green" would show up on the screen.

This is why prompting LLMs to show their steps works so well: it makes them work through the problem "in their head" more efficiently, rather than just spit out an answer.

However, you can give LLMs external access to tools. Ask GPT4 a particularly challenging math problem, and it will write a Python script and run it to get a solution. That is an LLM's "pen and paper".
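To make that concrete, here is a minimal sketch of the kind of throwaway script a model might write instead of doing the arithmetic token by token. The question being answered and the name `product_and_square_check` are made up for illustration; this is not output from any actual model.

```python
import math


def product_and_square_check(a: int, b: int) -> tuple[int, bool]:
    """Hypothetical helper: exact product of two integers, plus a perfect-square check."""
    # Python integers are arbitrary precision, so the product is exact.
    product = a * b
    # math.isqrt returns the integer floor of the square root (no float rounding).
    root = math.isqrt(product)
    return product, root * root == product


if __name__ == "__main__":
    product, is_square = product_and_square_check(123456789, 987654321)
    print(f"product = {product}, perfect square: {is_square}")
```

The model's job here is translating the question into code; the interpreter does the digit-level work.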

> That is an LLM's "pen and paper".

No, that is an LLM's calculator or programming, it doesn't actually do the steps when it does that. When I use pen and paper to solve a problem I do all steps on my own, when I use a calculator or a programming language the tool does a lot of the work.

That difference is massive, since using a calculator doesn't help me learn numbers, how they interact, and how algorithms work, while doing the steps myself does. So getting an LLM that can reliably execute algorithms like we humans can is probably a critical step towards making them as reliable and smart as humans.

I do agree, though, that if LLMs could keep a hidden voice they used to reason before writing, they could do better. But showing that voice to the end user shouldn't make the model dumber; you would just see more spam.

You are splitting hairs on technicalities here. You need to do a lot of "steps" to write a program that solves your question. Arguably even more steps and more complexity than using pen and paper.

Maybe we should be giving the LLMs MS Paint instead of Python to work out problems? There is nothing unique or "human" about running through a long division problem; it is ultimately just an algorithm that is followed to arrive at a solution.
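For reference, the schoolbook long division procedure mentioned above can be sketched in a few lines. This is only an illustration for non-negative integers; the function name `long_division` and the wording of the recorded steps are my own choices.

```python
def long_division(dividend: int, divisor: int) -> tuple[int, int, list[str]]:
    """Schoolbook long division, recording each step as a line of 'paper'."""
    if dividend < 0 or divisor <= 0:
        raise ValueError("this sketch only handles dividend >= 0 and divisor > 0")

    quotient_digits = []
    steps = []
    remainder = 0
    for digit in str(dividend):
        # "Bring down" the next digit of the dividend.
        current = remainder * 10 + int(digit)
        # The divisor fits into `current` at most 9 times, so q is one digit.
        q = current // divisor
        remainder = current - q * divisor
        quotient_digits.append(str(q))
        steps.append(f"{current} / {divisor} = {q} remainder {remainder}")

    return int("".join(quotient_digits)), remainder, steps


if __name__ == "__main__":
    quotient, remainder, steps = long_division(1234, 7)
    print("\n".join(steps))
    print(f"result: {quotient} remainder {remainder}")
```

Each recorded step only needs the previous remainder and the next digit, which is what makes the procedure mechanical.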

> That is an LLM's "pen and paper".

No, that's an LLM's Python playground.

An LLM's "pen and paper" is "think step by step", where it gets to see its own output to keep track of what it is doing.

I'd expect that with appropriate prompting one could get a good model to one/few-shot learn how to do addition this way.
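As a rough picture of what such a step-by-step trace for addition could look like, here is a sketch that adds two numbers digit by digit and writes every intermediate carry to a scratchpad. The trace format and the name `add_step_by_step` are assumptions for illustration, not a prompt format known to work with any particular model.

```python
def add_step_by_step(a: str, b: str) -> tuple[str, list[str]]:
    """Add two non-negative decimal strings right to left, logging every step."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)  # pad the shorter number with zeros

    carry = 0
    digits = []
    scratchpad = []
    for i in range(width - 1, -1, -1):
        total = int(a[i]) + int(b[i]) + carry
        digit, new_carry = total % 10, total // 10
        scratchpad.append(
            f"{a[i]} + {b[i]} + carry {carry} = {total}: write {digit}, carry {new_carry}"
        )
        digits.append(str(digit))
        carry = new_carry

    if carry:
        # A leftover carry becomes the leading digit.
        digits.append(str(carry))
        scratchpad.append(f"final carry {carry}: write {carry}")

    return "".join(reversed(digits)), scratchpad


if __name__ == "__main__":
    result, scratchpad = add_step_by_step("958", "67")
    print("\n".join(scratchpad))
    print("answer:", result)
```

Every line of the scratchpad depends only on two digits and the previous carry, so each step is small enough to check on its own.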

LLMs do that OK too... just not for crazy complex equations that would be tough for most humans with pen and paper.

See below which I have just run on GPT4: https://chat.openai.com/share/3adb3aa2-8aec-474f-bdb0-4d761d...

I know they can do that, but not as reliably as I can, for example, or as typical engineers from 80 years ago could. I did engineering exams without a calculator: I just did all the calculations with pen and paper and didn't make mistakes. It takes a bit longer, since calculating trigonometric functions takes a while, but still not a lot of time compared to how much you have.

That was how everyone did it back then; it really isn't that hard to do. Most people today have never tried it, so they think it is much harder than it actually is.