

	by yeldarb 818 days ago
	Love the concept. I've used vast.ai (similar "Airbnb for GPUs" pitch) for years to spin up cheap test machines with GPUs you can't really find in the cloud (and especially consumer-grade GPUs like 4090s). Any insight into how this is different/better?


nicowaltz 818 days ago

Main difference is that we are more opinionated (in terms of configurations) and sort of do the scrolling and sorting out for you, which hopefully makes for a smoother user experience. We sort out bad machines immediately. We're also working directly on making compute from little-known high-end data centers available. There's a lot of unused compute out there! See gpulist.ai

Also, I don't know if vast.ai does this, but with us you can have six user sessions on your machine if you have six GPUs, so granular utilization is possible.
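One common way to get one session per GPU is to pin each session's process to a single device through the `CUDA_VISIBLE_DEVICES` environment variable. The sketch below only illustrates that idea. How gpudeploy actually isolates sessions is not stated in this thread, and the helper name `session_envs` is made up for the example.

```python
import os


def session_envs(num_gpus):
    """Return one environment dict per session, each pinned to a single GPU index.

    Hypothetical helper: it copies the current environment and restricts
    CUDA to one device per session via CUDA_VISIBLE_DEVICES.
    """
    envs = []
    for gpu_index in range(num_gpus):
        env = dict(os.environ)
        env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
        envs.append(env)
    return envs


# Six GPUs give six independent session environments.
envs = session_envs(6)
```

Each dict could then be passed as `env=` to `subprocess.Popen` when launching a session's container or shell.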

bradfox2 818 days ago

We own and operate 40+ data center GPUs (V100s, A100s, and Ax000s) in a private cluster and use vast to rent out unused capacity.

What would make you better than vast is extremely easy spot leasing and job prioritization.

I want to be able to have one of our training jobs finish, and then have the capacity immediately transition to a lease. With vast, we are renting in week-long blocks.

jcannell 817 days ago

You should be able to do that right now on vast. You just need to rent the GPUs yourself with your own on-demand instance(s) for your training job. As soon as it finishes, you stop or destroy those instance(s) and the GPUs are available immediately (and if there are any other instances queued up in scheduling, they will start up). Your actual job doesn't necessarily need to run in the container (if you know what you are doing).

(I'm the founder of vast btw - contact us for help on setting this up and/or any feedback on making it an easier/better process)
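The hand-off described above (the owner holds the GPUs with their own instance, and queued instances start once it is released) can be modelled with a toy in-memory scheduler. This is only a sketch of the behaviour as described in the comment. It is not vast's API or scheduler, and the `Host` class and the instance names are invented for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Host:
    """Toy model of one host: free GPU ids plus a FIFO queue of waiting instances."""
    free_gpus: set
    running: dict = field(default_factory=dict)   # instance name -> set of GPU ids
    queue: deque = field(default_factory=deque)   # (instance name, GPUs needed)

    def request(self, name, count):
        """Start the instance now if enough GPUs are free, otherwise queue it."""
        if not self.queue and len(self.free_gpus) >= count:
            self._start(name, count)
            return True
        self.queue.append((name, count))
        return False

    def release(self, name):
        """Stop an instance, free its GPUs, then start queued instances in order."""
        self.free_gpus |= self.running.pop(name)
        while self.queue and len(self.free_gpus) >= self.queue[0][1]:
            self._start(*self.queue.popleft())

    def _start(self, name, count):
        self.running[name] = {self.free_gpus.pop() for _ in range(count)}


host = Host(free_gpus={0, 1, 2, 3})
host.request("own-training-job", 4)   # owner occupies all four GPUs
host.request("renter-a", 2)           # no free GPUs, so this is queued
host.request("renter-b", 2)           # queued behind renter-a
host.release("own-training-job")      # training done: both renters start at once
```

The queue here is strictly first-in, first-out. Job prioritization, as requested earlier in the thread, would replace the deque with a priority queue.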

nicowaltz 818 days ago

Exactly, that's the idea

bradfox2 818 days ago

Is it implemented?

DeathArrow 817 days ago

Can you expand on spot leasing and job prioritization? What kind of api would you prefer? How would you like to adjust time slices?

bigcat12345678 818 days ago

You can do that on llm.sxwl.ai. Shoot me an email at z@sxwl.ai for instructions. The website is pretty outdated; the main UI is a RESTful API (which we haven't had time to document yet).

mkl 817 days ago

If you have time to answer emails with instructions, you have time to update your site and documentation. Why not just do that?

grepfru_it 817 days ago

Don’t build something if you don’t have a use case. All you have is a wishlist, until someone says “yes I want this here is $$$” which I assume the email will facilitate.

cfn 817 days ago

Just a quick comment: The country list in your Add Basic Node Data is not sorted.

nicowaltz 817 days ago

Will fix.

ganoushoreilly 818 days ago

I'm also interested in what the differentiating factor is. Would also like to see more documentation for onboarding rather than just "Ubuntu and root available".

icelancer 818 days ago

With vast you have to choose specific machines. gpudeploy routes to whatever resources are available.

vast has a lot of bad machines with terrible PCIe lanes and architecture you have to learn the hard way. Someone on HN wrote a script that ran a test Docker image on every machine and auto-tagged the machines' quality using their API, which is what I'd do if I were going to use vast seriously for compute.

lelanthran 817 days ago

>> Im also interested in what the differing factor is.

> vast has a lot of bad machines with terrible PCIe lanes and architecture you have to learn the hard way.

Wouldn't gpudeploy have exactly the same problem? How is it mitigated with gpudeploy?

everforward 817 days ago

I think it's more of a business strategy issue than a technical one.

I suspect it would be trivial for Vast or GPUDeploy to spin up a benchmarking job before allowing sales on that machine. I'm not an expert on PCIe lanes, but I would think the performance issues would be visible via bandwidth or latency on the lanes.
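As a rough sketch of what such a gating check could look like: compare a measured host-to-device copy rate against the theoretical PCIe peak for the advertised link, and tag machines that fall far short. The per-lane rates are the standard PCIe figures. The 0.7 threshold, the tag names, and the sample measurements are made up for illustration, and neither marketplace is known to work this way.

```python
# Per-lane signalling rate in GT/s for PCIe generations 3 to 5 (128b/130b encoding).
GT_PER_S = {3: 8.0, 4: 16.0, 5: 32.0}


def pcie_peak_gb_per_s(gen, lanes):
    """Theoretical one-direction PCIe bandwidth in GB/s, before protocol overhead."""
    return GT_PER_S[gen] * (128 / 130) / 8 * lanes


def tag_machine(measured_gb_per_s, gen, lanes, min_fraction=0.7):
    """Tag a machine from a measured host-to-device copy rate.

    min_fraction is an arbitrary illustrative threshold, not a known marketplace rule.
    """
    peak = pcie_peak_gb_per_s(gen, lanes)
    return "ok" if measured_gb_per_s >= min_fraction * peak else "slow-pcie"


# Hypothetical measurements: (machine id, measured GB/s, advertised gen, advertised lanes)
results = [("m1", 24.0, 4, 16), ("m2", 3.1, 4, 16)]
tags = {machine: tag_machine(bw, gen, lanes) for machine, bw, gen, lanes in results}
```

A machine advertised as PCIe 4.0 x16 but measuring about 3 GB/s is most likely running on far fewer lanes (for example through a riser), which is the kind of problem a pre-listing benchmark would surface.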

It kind of makes sense to me, though. If I were looking for absolute reliability and was willing to pay for it, I'd just go to one of the many GPU cloud vendors. Likewise, I suspect anyone willing to really work on getting good performance would rather be a real provider or sub-provider than being part of this nebulous C2C GPU cloud.

polygot 810 days ago

Do you have a link to that thread/Docker image? I would be very interested in using it.

icelancer 796 days ago

I don't, sorry. I would love to use it as well; I should've bookmarked it! Also, I'm not sure the person open-sourced it.

firloop 818 days ago

I use vast.ai somewhat often. It's great!

tehsauce 818 days ago

+1 for vast. they usually are the cheapest and have the most supply. some instances can be less reliable at the low end though