Hacker News
by jstanley 661 days ago
> ChatGPT ended up needing Python tooling to reliably calculate 2+2.

This is untrue. ChatGPT very reliably calculates 2+2 without invoking any tooling.

2 comments

Nit: it predicts that the next token is '4'.

Token frequency in the pre-training corpus and the way tokenization is implemented both impact arithmetic proficiency for LLMs.

OpenAI calls this out in the GPT-4 technical report.
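To make the tokenization point concrete, here is a toy sketch. It is not how any real model tokenizes (production tokenizers use learned BPE merges, not this greedy longest-match rule), and `TOY_VOCAB` and `greedy_tokenize` are made up for illustration. It only shows how a familiar string like "2+2=4" can map to clean, frequently seen tokens while a longer number gets chopped into uneven pieces.

```python
# Hypothetical vocabulary: single characters plus a few multi-digit chunks.
TOY_VOCAB = set("0123456789+=") | {"12", "100", "2020"}


def greedy_tokenize(text: str, vocab: set) -> list:
    """Toy greedy longest-match tokenizer (an assumption, not real BPE)."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab or j == i + 1:
                tokens.append(piece)
                i = j
                break
    return tokens


print(greedy_tokenize("2+2=4", TOY_VOCAB))     # ['2', '+', '2', '=', '4']
print(greedy_tokenize("1234+100", TOY_VOCAB))  # ['12', '3', '4', '+', '100']
```

In the second case the digits of 1234 no longer line up with place value, which is one plausible reason arithmetic on less common numbers is harder to predict than "2+2".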

You can see this by giving it broken code and seeing what it can predict.

I gave Copilot a number of implementations of factorial with an input of 5. When it recognized a correct implementation, it was able to combine the ideas of "factorial", "5", and "correct implementation" to output 120. But when I gave it buggy implementations, it could recognize that they were wrong, yet the concepts of "factorial", "5", and "incorrect implementation" weren't enough for it to output the wrong result the code actually produces. Even when I explained that its attempts to calculate the wrong output were themselves wrong, it couldn't 'calculate' the right answer.
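For concreteness, a minimal sketch of the kind of pair I mean. These are not the exact implementations from that session; both functions and the particular off-by-one bug are hypothetical stand-ins.

```python
def factorial_correct(n: int) -> int:
    """Iterative factorial: multiplies 1 * 2 * ... * n."""
    result = 1
    for i in range(1, n + 1):
        result *= i
    return result


def factorial_buggy(n: int) -> int:
    """Hypothetical buggy variant: the loop stops one step early."""
    result = 1
    for i in range(1, n):  # bug: should be range(1, n + 1)
        result *= i
    return result


print(factorial_correct(5))  # 120
print(factorial_buggy(5))    # 24
```

The 120 can be recalled from "factorial" and "5" alone. The 24 can only be obtained by actually tracing the loop, which is the step that failed.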

This makes very little sense as a contrast to ChatGPT predicting that the likely continuation of "factorial" and "5" is 120.

If you are able to share the chat session, it might be possible to see whether the various factorial implementations confused the issue, or whether you got ChatGPT to run your code with 5 as input.

I mean the code is redundant:

https://chatgpt.com/share/be249097-5067-4e3d-93c7-3eebedb510...

Do a Google search with 'before:2020' on that code; that is recall from pre-training, not 'calculating'.
I misread gp's comment, we're in agreement.
Sure, but I think you get my point.