| An Opus 4.7/GPT-5.5 class model is 5 trillion parameters[1]. To run an 8-bit quantized version of that you need roughly 5TB of RAM. Today that is around 18 NVIDIA B300s. That's around $900,000, not including the computers to run them in. It's true that the capability of open source models is improving, but running actual frontier models on your MBP seems a way off. [1] https://x.com/elonmusk/status/2042123561666855235?s=20 (and Elon has hired enough people out of those labs to have a fair idea) |
Today's LLMs pack far more capability into fewer parameters than the models of 2023 did. We may still be in a very rudimentary phase of this technology, with low-hanging efficiency gains to be had left and right. These models also consume many orders of magnitude more energy than a human brain, which suggests plenty of room for improvement.
The right question: is there a law in information theory that fundamentally prevents a 70B model of any architecture from being as smart as Opus 4.7?
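
For a sense of scale, here is a rough sketch of the arithmetic behind the quoted numbers, set against the 70B case. It counts weights only and ignores KV cache and activations. The 288 GB per B300 and the $50,000 per GPU are my assumptions, chosen because they reproduce the quoted "18 GPUs" and "$900,000". The function names are made up for illustration.

```python
import math

def weight_memory_gb(params: float, bits_per_weight: int) -> float:
    """Memory for the weights alone, in decimal GB (no KV cache, no activations)."""
    return params * bits_per_weight / 8 / 1e9

def gpus_needed(params: float, bits_per_weight: int, gpu_memory_gb: float) -> int:
    """Smallest number of GPUs whose combined memory holds the weights."""
    return math.ceil(weight_memory_gb(params, bits_per_weight) / gpu_memory_gb)

B300_MEMORY_GB = 288        # assumption: per-GPU memory capacity
PRICE_PER_GPU_USD = 50_000  # assumption: implied by $900,000 / 18

frontier_gb = weight_memory_gb(5e12, 8)               # 5T params at 8 bits
frontier_gpus = gpus_needed(5e12, 8, B300_MEMORY_GB)
frontier_cost = frontier_gpus * PRICE_PER_GPU_USD
small_gb = weight_memory_gb(70e9, 8)                  # 70B params at 8 bits

print(f"5T model:  {frontier_gb:,.0f} GB, {frontier_gpus} GPUs, ${frontier_cost:,}")
print(f"70B model: {small_gb:,.0f} GB")
```

Under those assumptions the 70B model needs about 70 GB for its weights, roughly 1/70th of the 5T footprint, which is why the parameter-efficiency question matters so much for running anything locally.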