by singlepaynews 249 days ago
Very cool. I jumped in here thinking it was gonna be something else though: a packaged service for running on-prem models distributed across multiple GPUs.

I'm basically imagining a vast.ai-type deployment of an on-prem GPT: assuming most infra is consumer GPUs on consumer devices, the "company cluster" would run on the combined compute of the company's machines.

3 comments

Great point. I can see how you'd land there. Also a great idea! xD

Maybe a better descriptor is "self-sovereign AI" or "self-hosted AI"?

Sounds like something that could be implemented with llm-d, though I've not experimented with it.

https://llm-d.ai/blog/intelligent-inference-scheduling-with-...

Yeah, I don't see why we couldn't integrate that. I think that is the next step as we move our workloads to production.
`lf deploy` here we come!

We're building something closer to this at Muna: https://docs.muna.ai. Check us out and let me know what you think!

Let me know when you open source it; I think there is a place for this, and I think we could integrate it as a plugin pretty easily into the LlamaFarm framework :)