by bradfox2 818 days ago

We own and operate 40+ data center GPUs (V100s, A100s, and Ax000s) in a private cluster and use vast to rent out unused capacity.

What would make you better than vast is extremely easy spot leasing and job prioritization.

I want to be able to have one of our training jobs finish, and then have the capacity immediately transition to a lease. With vast, we are renting in week-long blocks.

4 comments

jcannell 817 days ago

You should be able to do that right now on vast. You just need to rent the GPUs yourself with your own on-demand instance(s) for your training job. As soon as it finishes, you stop or destroy those instance(s) and the GPUs are available immediately (and if there are any other instances queued up in scheduling, they will start up). Your actual job doesn't necessarily need to run in the container (if you know what you are doing).

(I'm the founder of vast btw - contact us for help on setting this up and/or any feedback on making it an easier/better process)
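A minimal sketch of the hand-off described above, as a toy simulation only. It does not call the vast API or CLI; the `Host` class, its methods, and the instance names are hypothetical, and FIFO start order for queued instances is an assumption. It just illustrates the idea that the owner holds the GPUs with their own instance and that destroying it lets queued instances start right away.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Host:
    """Toy model of one GPU host (hypothetical; not the vast API)."""
    total_gpus: int
    running: dict = field(default_factory=dict)  # instance name -> GPUs held
    queue: deque = field(default_factory=deque)  # (name, gpus) waiting to start

    def free_gpus(self) -> int:
        return self.total_gpus - sum(self.running.values())

    def request(self, name: str, gpus: int) -> bool:
        """Start the instance if it fits right now, otherwise queue it."""
        if not self.queue and gpus <= self.free_gpus():
            self.running[name] = gpus
            return True
        self.queue.append((name, gpus))
        return False

    def destroy(self, name: str) -> list:
        """Release an instance's GPUs, then start queued instances in FIFO order."""
        self.running.pop(name)
        started = []
        while self.queue and self.queue[0][1] <= self.free_gpus():
            queued_name, gpus = self.queue.popleft()
            self.running[queued_name] = gpus
            started.append(queued_name)
        return started


host = Host(total_gpus=8)
host.request("owner-training", 8)  # owner holds all GPUs for the training job
host.request("renter-a", 4)        # queued: no free GPUs yet
host.request("renter-b", 4)        # queued behind renter-a

started = host.destroy("owner-training")  # training finished, capacity released
print(started)  # ['renter-a', 'renter-b']
```

In this model the owner's training job is just another instance, so no separate prioritization mechanism is needed: whoever holds the instance holds the GPUs until it is destroyed.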


nicowaltz 818 days ago

Exactly, that's the idea


bradfox2 817 days ago

Is it implemented?


DeathArrow 817 days ago

Can you expand on spot leasing and job prioritization? What kind of API would you prefer? How would you like to adjust time slices?


bigcat12345678 818 days ago

You can do that on llm.sxwl.ai. Shoot me an email at z@sxwl.ai for instructions. The website is pretty outdated; the main interface is a RESTful API (which we haven't had time to document yet).


mkl 817 days ago

If you have time to answer emails with instructions, you have time to update your site and documentation. Why not just do that?


grepfru_it 817 days ago

Don’t build something if you don’t have a use case. All you have is a wishlist until someone says “yes, I want this, here is $$$”, which I assume the email will facilitate.
