| HN Mirror



	by singlepaynews 296 days ago
	Very cool. I jumped in here thinking it was going to be something else, though: a packaged service for running on-prem models distributed across multiple GPUs. I'm basically imagining a vast.ai-type deployment of an on-prem GPT. Assuming that most infra is consumer GPUs on consumer devices, the idea is to run the "company cluster" as the combined compute of the company's machines.

3 comments

mhamann 296 days ago

Great point. I can see how you'd land there. Also a great idea! xD

Maybe a better descriptor is "self-sovereign AI"? Or "self-hosted AI"?


jochalek 296 days ago

Sounds like something that could be implemented with llm-d, though I've not experimented with it.

https://llm-d.ai/blog/intelligent-inference-scheduling-with-...


rgthelen 296 days ago

Yeah, I don't see why we could not integrate that. I think that is the next step as we move our workloads to production.


mhamann 296 days ago

`lf deploy` here we come!


olokobayusuf 296 days ago

We're building something closer to this at Muna: https://docs.muna.ai . Check us out and let me know what you think!


dang 296 days ago

https://news.ycombinator.com/item?id=43119777


rgthelen 296 days ago

Let me know when you open source it; I think there is a place for this, and I think we could integrate it pretty easily as a plugin into the LlamaFarm framework :)
