That's an interesting rebuttal if you can suggest near-future architectures which don't require their own nuclear power plants to reliably calculate 13 x 54.
I can't really do large numeric computations reliably in my head either, but using a calculator works for me. Maybe let the LLM use a calculator?
It seems to me that we already have this, and it works great. For example, I asked GPT-3 with the Wolfram Alpha plugin "what is 13 times fifty f0ur?" and it immediately gave the correct answer, having translated the question into machine-readable math and then handed the actual calculation off to Wolfram Alpha. Wolfram Alpha on its own could not do this calculation, as it cannot parse my weird input text. GPT-3 can get this one right on its own, but presumably not the more complex math problems that Wolfram Alpha still handles well.
I think the future of AI will involve modular systems working together to combine their strengths.
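To make the division of labor concrete, here is a minimal Python sketch of that pipeline. Everything in it is an assumption for illustration: `toy_translate` is a hypothetical stand-in for the language model's translation step (a tiny word lookup, nothing like a real model), and `calculate` is a stand-in for the external tool. It is not how the Wolfram Alpha plugin actually works.

```python
import ast
import operator

# Operators the stand-in "calculator tool" is willing to evaluate.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculate(expression: str):
    """Evaluate a plain arithmetic expression without calling eval()."""
    def walk(node):
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError(f"unsupported expression: {expression!r}")

    return walk(ast.parse(expression, mode="eval").body)


# Hypothetical vocabulary for the toy translator (assumption, not a real model).
_WORDS = {"four": 4, "fifty": 50, "times": "*"}


def toy_translate(question: str) -> str:
    """Turn a messy question into machine-readable math (toy version)."""
    tokens = []
    for raw in question.lower().rstrip("?").split():
        if raw.isdigit():
            value = int(raw)
        else:
            # Undo the "0 for o" typo, then look the word up; skip filler words.
            value = _WORDS.get(raw.replace("0", "o"))
            if value is None:
                continue
        # Adjacent number words combine additively: "fifty four" -> 54.
        if isinstance(value, int) and tokens and isinstance(tokens[-1], int):
            tokens[-1] += value
        else:
            tokens.append(value)
    return " ".join(str(t) for t in tokens)


expression = toy_translate("what is 13 times fifty f0ur?")
print(expression, "=", calculate(expression))  # prints "13 * 54 = 702"
```

The point of the split is that the fuzzy part (understanding "fifty f0ur") and the exact part (the arithmetic) are handled by different components, each doing what it is good at.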
I’m certain you’re joking here, but I wanted to add that multiplying a few digits is learned naturally from data without any trouble. Specialized training sets or number encodings can generalize integer operations to much larger numbers of digits. However, generalizing to an unbounded number of digits is not possible. Even with specialized encodings like those mentioned by Apple in their RASP-L paper, models likely only reach the limits of whatever algorithm fits in the available context length (which has to hold the intermediate results) and in the total model size (which bounds the algorithm's complexity).
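As a rough illustration of why intermediates matter, here is a Python sketch of schoolbook multiplication that records every partial product. It is only an analogy I am assuming for the sake of the argument, not the method from the RASP-L paper: the function names are hypothetical, and "scratchpad characters" is a crude proxy for the context a model would need if it wrote its intermediate results out.

```python
def long_multiply(a: int, b: int):
    """Schoolbook multiplication of non-negative integers.

    Returns the product and the list of partial products, one per digit of b.
    """
    partials = []
    for place, digit_char in enumerate(reversed(str(b))):
        partials.append(a * int(digit_char) * 10 ** place)
    return sum(partials), partials


def scratchpad_chars(a: int, b: int) -> int:
    """Count the characters needed to write out all partial products."""
    _, partials = long_multiply(a, b)
    return sum(len(str(p)) for p in partials)


product, partials = long_multiply(13, 54)
print(product, partials)  # prints "702 [52, 650]"

# The scratchpad grows much faster than the inputs do.
print(scratchpad_chars(13, 54))
print(scratchpad_chars(12345678, 87654321))
```

With this particular algorithm the written-out intermediates grow roughly quadratically in the number of digits, so any fixed context length runs out at some operand size.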
They're already operating on an architecture that can do that for about a nanojoule.
You can also just ask them to write code for you, which appears to be what ChatGPT does now: it has its own Python environment. I'm not sure what's in it beyond matplotlib and pandas, but it has at least those.
I don't know if it's unique to my use case (research), but I haven't had much luck getting ChatGPT to develop usable code. At best, it seems useful for identifying packages to look into when solving the problem. Maybe my prompts just need improving.
My experience is that the quality varies wildly by task.
As an iOS dev, I certainly wouldn't call it "expert", but it's generally "good enough" to be a starting point whenever I get stuck, and on several occasions it has surprised me with a complete, bug-free solution. Likewise when I ask it for web app stuff, though as that isn't my domain I wouldn't be able to tell you if the answers were "good" or "noob".
I do also have custom instructions set, but the critical thing here is the link to the Python script at the end of the message, the blue text that reads: [>_]