by Aurornis
94 days ago
It wasn't considered impossible. There are examples all over the internet of large MoE LLMs running on small hardware, like giant models on a Raspberry Pi 5. It's just so slow that nobody pursued it seriously. It's fun to see these tricks implemented, but even on this top-spec 2025 iPhone Pro the output is 100x slower than output from hosted services.