Hacker News new | ask | show | jobs
by Maxious 409 days ago
> prima.cpp is a distributed implementation of llama.cpp that lets you run 70B-level LLMs on your everyday devices— laptops, desktops, phones, and tablets (GPU or no GPU, it’s all good). With it, you can run QwQ-32B, Qwen 2.5-72B, Llama 3-70B, or DeepSeek R1 70B right from your local home cluster!

https://github.com/Lizonghang/prima.cpp