|
|
|
|
|
by rahen
364 days ago
|
|
No need for an RPi 5. Back in 1982, a dual or quad-CPU X-MP could have run a small LLM, say, with 200–300K weights, without trouble. The Crays were, ironically, very well suited for neural networks, we just didn’t know it yet. Such an LLM could have handled grammar and code autocompletion, basic linting, or documentation queries and summarization. By the late 80s, a Y-MP might even have been enough to support a small conversational agent. A modest PDP-11/34 cluster with AP-120 vector coprocessors might even have served as a cheaper pathfinder in the late 70s for labs and companies who couldn't afford a Cray 1 and its infrastructure. But we lacked both the data and the concepts. Massive, curated datasets (and backpropagation!) weren’t even a thing until the late 80s or 90s. And even then, they ran on far less powerful hardware than the Crays. Ideas and concepts were the limiting factor, not the hardware. |
|
A "small Large Language Model", you say? So a "Language Model"? ;-)
> Such an LLM could have handled grammar and code autocompletion, basic linting, or documentation queries and summarization.
No, not even close. You're off by 3 orders of magnitude if you want even the most basic text understanding, 4 OOM if you want anything slightly more complex (like code autocompletion), and 5–6 OOM for good speech recognition and generation. Hardware was very much a limiting factor.