Hacker News
Pool spare GPU capacity to run LLMs at larger scale
(github.com)
11 points by i386 80 days ago | 3 comments
lostmsu
80 days ago
> MoE models via expert sharding with zero cross-node inference traffic
This makes the whole project questionable
vagrantJin
80 days ago
This is very promising, definitely looks more user friendly than exo. Can't wait to try it out.
iwinux
80 days ago
You lost me on "spare GPU". I don't have any capable GPUs, let alone spare ones :)