| What I wonder, as a computer scientist: If you want to solve grade school math problems, why not use an 'add' instruction? It's been around since the 50s, runs a billion times faster than an LLM, every assembly-language programmer knows how to use it, every high-level language has a one-token equivalent, and doesn't hallucinate answers (other than integer overflow). We also know how to solve complex reasoning chains that require backtracking. Prolog has been around since 1972. It's not used that much because that's not the programming problem that most people are solving. Why not use a tool for what it's good for and pick different tools for other problems they are better for? LLMs are good for summarization, autocompletion, and as an input to many other language problems like spelling and bigrams. They're not good at math. Computers are really good at math. There's a theorem that an LLM can compute any computable function. That's true, but so can lambda calculus. We don't program in raw lambda calculus because it's terribly inefficient. Same with LLMs for arithmetic problems. |
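To make the "other than integer overflow" caveat concrete, here's a rough sketch in Python. The quote doesn't name a register width, so the 32-bit width and the helper name `add_i32` are my own assumptions for illustration. Python's integers are arbitrary precision, so the wraparound is simulated by hand.

```python
def add_i32(a: int, b: int) -> int:
    """Add two integers with 32-bit two's-complement wraparound.

    The 32-bit width is an assumption for illustration; real hardware
    'add' instructions come in several widths.
    """
    total = (a + b) & 0xFFFFFFFF  # keep only the low 32 bits
    # Reinterpret the result as signed if the sign bit is set.
    return total - 0x100000000 if total & 0x80000000 else total


# A grade-school problem reduced to a single deterministic operation.
apples = add_i32(17, 25)

# The one failure mode the quote mentions: overflow at the register width.
wrapped = add_i32(2**31 - 1, 1)

print(apples, wrapped)
```

The point being that the failure mode is fully specified: the result wraps at a known boundary, so you can test for it up front, which is a different kind of error from a model producing a plausible-looking wrong sum.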
[1] http://www.incompleteideas.net/IncIdeas/BitterLesson.html
[2] Which the people making these models are familiar with. The whole thing is a trillion+ parameter linear-algebra-crunching machine, after all.