by p1esk 1207 days ago
Yes, obviously cloud providers get their hardware at a fraction of the cost I was quoted; they are ordering thousands of servers. I was only buying four. No one would negotiate with me, I tried. I suppose if I had a 7 digit budget I could get a better deal.

I was mainly talking about training workloads, inference is a different beast. I'm actually surprised you have 100% inference utilization - customer load typically scales dynamically, so with on-prem servers you would need to over-provision.

CEOs don't usually order hardware, they have IT people for that, with input from people like me (ML engineers) who could estimate the workloads, future needs, and specific hw requirements (e.g. GPU memory). And when your people come to you asking for budget, while you're trying to raise the next round, you're more likely to approve the 'no high upfront cost' option, right?

In my situation, when asked about buy vs rent my initial reaction was "definitely buy", but when I actually looked at the numbers, the 3-year break-even period, the lack of upfront costs for cloud, and not having to provision storage and networking made renting an easy recommendation. The cost of cloud GPUs has come down dramatically in the last couple of years.

Though I would like to have at least a couple of local GPU servers for quick experimentation/prototyping, because sometimes the overhead of spinning up a new instance and copying datasets is too great relative to the task.


> I suppose if I had a 7 digit budget I could get a better deal.

We got our "deal" when buying just a single server and have since bought many more with the same provider. We didn't spend 7 figures all at once, we did it piece-meal over time. There is nothing stopping you from getting much better prices.

> I'm actually surprised you have 100% inference utilization - customer load typically scales dynamically, so with on-prem servers you would need to over-provision.

It is pretty easy to achieve 100% inference utilization if you can find inference work that does not need to be done on-demand. We have a priority queue and the lower priority work gets done during periods with lower demand.
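
A minimal sketch of that kind of two-tier priority queue, not the commenter's actual system. The class name `InferenceQueue`, the priority levels, and the job names are all hypothetical, chosen only for illustration:

```python
import heapq
import itertools


class InferenceQueue:
    """Serve on-demand jobs first; deferrable jobs fill otherwise idle GPU time."""

    ON_DEMAND = 0  # lower number = served first
    DEFERRABLE = 1

    def __init__(self):
        self._heap = []
        # Monotonic counter keeps FIFO order among jobs of equal priority.
        self._counter = itertools.count()

    def submit(self, job, priority):
        heapq.heappush(self._heap, (priority, next(self._counter), job))

    def next_job(self):
        """Return the highest-priority pending job, or None if the queue is empty."""
        if not self._heap:
            return None
        _, _, job = heapq.heappop(self._heap)
        return job


queue = InferenceQueue()
queue.submit("nightly-batch-1", InferenceQueue.DEFERRABLE)
queue.submit("user-request-1", InferenceQueue.ON_DEMAND)
queue.submit("nightly-batch-2", InferenceQueue.DEFERRABLE)

# The worker loop always pulls the next job, so the GPU never sits idle
# while any work (urgent or deferrable) is pending.
order = [queue.next_job() for _ in range(3)]
```

Here the on-demand request jumps ahead of the batch work submitted earlier, and the batch jobs run in submission order once nothing urgent is waiting.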

> CEOs don't usually order hardware, they have IT people for that, with input from people like me (ML engineers) who could estimate the workloads, future needs, and specific hw requirements (e.g. GPU memory).

Judging by this conversation it seems like "people like you" may not be the best people to answer this question since the best hardware quote you could get was at a >100% markup! At a startup that specializes in ML research and work the CEO is going to be intimately familiar with ML workloads, needs, and hardware requirements.

> And when your people come to you asking for budget, while you're trying to raise the next round, you're more likely to approve the 'no high upfront cost' option, right?

If the break even point is 6-7 months and our runway is longer than 6-7 months why would this matter?
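
The buy-vs-rent comparison being argued here comes down to simple arithmetic. A rough sketch, assuming a fixed monthly cloud bill and a fixed monthly on-prem running cost; every number below is made up for illustration and none comes from this thread:

```python
def break_even_months(purchase_cost, monthly_onprem_cost, monthly_cloud_cost):
    """Months until buying is cheaper than renting, or None if it never is."""
    monthly_savings = monthly_cloud_cost - monthly_onprem_cost
    if monthly_savings <= 0:
        return None  # renting is never more expensive, so there is no break-even
    return purchase_cost / monthly_savings


def buying_pays_off(purchase_cost, monthly_onprem_cost, monthly_cloud_cost, runway_months):
    """True if the purchase breaks even before the runway ends."""
    months = break_even_months(purchase_cost, monthly_onprem_cost, monthly_cloud_cost)
    return months is not None and months < runway_months


# Hypothetical inputs: server price, on-prem power/hosting, equivalent cloud rental.
months = break_even_months(
    purchase_cost=120_000,
    monthly_onprem_cost=2_000,
    monthly_cloud_cost=20_000,
)
worth_it = buying_pays_off(120_000, 2_000, 20_000, runway_months=18)
```

The result is very sensitive to utilization: if the rented instances would only be running part of the time, the effective monthly cloud cost drops and the break-even period stretches accordingly.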

> the best hardware quote you could get was at a >100% markup!

Now I’m really curious - if you can share - how much did you pay, and when was it? Are you talking about 40GB or 80GB cards? How did you negotiate? Any attempts I made were shut down with simple “no, that’s our final price”. What’s the secret?

> At a startup that specializes in ML research and work the CEO is going to be intimately familiar with ML workloads, needs, and hardware requirements.

I work at a startup which builds hardware accelerators, primarily for large NLP models. A large part of my job is to be intimately familiar with ML workloads, needs, and hardware requirements. Our CEO definitely doesn’t have enough of that knowledge to choose the right hardware for our ML team. In fact, even most people on our ML team don’t have deep, up-to-date knowledge about GPUs, GPU servers, or GPU server clusters. I happen to know because I’ve always had an interest in hardware and I’ve been building GPU clusters since grad school.

As mentioned in another comment, the contract has very clear language not to share it - likely because they are offering different prices to different companies.

So I don't feel comfortable sharing any specifics, especially since this account is directly tied to my name.

With that being said, the negotiation process was pretty straightforward:

- Emailed several vendors telling them we are a small startup, we are looking to make many purchases, but right now we are starting with one. We told everyone our purchasing decision was solely based on cost (given equivalent hardware) and to please put your best quote forward.

- Got back all of our prices. Went to the second cheapest one and told them they were beat and offered them the ability to go lower, which they did. We went with that vendor.

- For our next purchase, we went to the original lowest vendor (who got beat out), told them they lost out to price, and if they can go lower than that we would go with them and continue to give them business moving forward. They went quite a bit lower than what they originally offered, and what the vendor we first purchased from gave. We bought our second order from them and have used them ever since.

> We got our "deal" when buying just a single server and have since bought many more with the same provider. We didn't spend 7 figures all at once, we did it piece-meal over time. There is nothing stopping you from getting much better prices.

If it is as easy as you make it sound, why would you not just share the vendor name? I personally would love an 8xH100 machine for transformer experiments, but $100k+ pricing makes it a non-starter.

The contract has very clear language not to share it - likely because they are offering different prices to different companies.

(And as p1esk mentioned, there is no way you are getting H100s for under $100k).

8xH100 machine is ~300k I’ve heard.

Well, the person above claims 8xA100 significantly under $130k. I am curious to hear more.

Sure, but you mentioned H100 machine, and those are about 2.5x more expensive.