Yes, obviously cloud providers get their hardware at a fraction of the cost I was quoted; they are ordering thousands of servers. I was only buying four. No one would negotiate with me; I tried. I suppose if I had a 7-digit budget I could get a better deal. I was mainly talking about training workloads; inference is a different beast. I'm actually surprised you have 100% inference utilization - customer load typically scales dynamically, so with on-prem servers you would need to over-provision. CEOs don't usually order hardware, they have IT people for that, with input from people like me (ML engineers) who could estimate the workloads, future needs, and specific hw requirements (e.g. GPU memory). And when your people come to you asking for budget, while you're trying to raise the next round, you're more likely to approve the 'no high upfront cost' option, right? In my situation, when asked about buy vs rent, my initial reaction was "definitely buy", but when I actually looked at the numbers, the 3-year break-even period, the lack of upfront costs for cloud, and no need to provision storage and networking made renting an easy recommendation. The cost of cloud GPUs has come down dramatically in the last couple of years. I would still like to have at least a couple of local GPU servers for quick experimentation/prototyping, though, because sometimes the overhead of spinning up a new instance and copying datasets is too great relative to the task.
We got our "deal" when buying just a single server and have since bought many more from the same provider. We didn't spend 7 figures all at once; we did it piecemeal over time. There is nothing stopping you from getting much better prices.
> I'm actually surprised you have 100% inference utilization - customer load typically scales dynamically, so with on-prem servers you would need to over-provision.
It is pretty easy to achieve 100% inference utilization if you can find inference work that does not need to be done on-demand. We have a priority queue and the lower priority work gets done during periods with lower demand.
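To make the idea concrete, here is a minimal sketch of that kind of priority queue in Python. It is only an illustration, not our actual scheduler: the `InferenceQueue` class, the two priority levels, and the job names are all hypothetical. On-demand requests always come out first, and deferrable batch work fills whatever capacity is left.

```python
import heapq
import itertools

# Hypothetical priority levels: a lower number is served first.
ON_DEMAND, BATCH = 0, 1


class InferenceQueue:
    """Toy priority queue: on-demand jobs always run before batch jobs."""

    def __init__(self):
        self._heap = []
        # Monotonic counter breaks ties, so jobs stay FIFO within a priority.
        self._counter = itertools.count()

    def submit(self, priority, job):
        heapq.heappush(self._heap, (priority, next(self._counter), job))

    def next_job(self):
        """Return the highest-priority job, or None if the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


queue = InferenceQueue()
queue.submit(BATCH, "batch job 1")        # deferrable work (made-up names)
queue.submit(ON_DEMAND, "customer request A")
queue.submit(BATCH, "batch job 2")
queue.submit(ON_DEMAND, "customer request B")

# A GPU worker simply pulls the next job whenever it is free.
order = []
while (job := queue.next_job()) is not None:
    order.append(job)
```

As long as there is a backlog of batch work, a worker never sits idle when customer traffic dips, which is how utilization can stay near 100% without over-provisioning for peak load.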
> CEOs don't usually order hardware, they have IT people for that, with input from people like me (ML engineers) who could estimate the workloads, future needs, and specific hw requirements (e.g. GPU memory).
Judging by this conversation, it seems like "people like you" may not be the best people to answer this question, since the best hardware quote you could get was at a >100% markup! At a startup that specializes in ML research and work, the CEO is going to be intimately familiar with ML workloads, needs, and hardware requirements.
> And when your people come to you asking for budget, while you're trying to raise the next round, you're more likely to approve the 'no high upfront cost' option, right?
If the break-even point is 6-7 months and our runway is longer than that, why would this matter?
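The comparison is simple arithmetic, roughly sketched below. The function names and every dollar figure are hypothetical, picked only so the example lands on a 7-month break-even; plug in your own quotes.

```python
import math


def break_even_months(upfront_cost, onprem_monthly_cost, cloud_monthly_cost):
    """Months until buying is cheaper than renting, or None if it never is."""
    monthly_saving = cloud_monthly_cost - onprem_monthly_cost
    if monthly_saving <= 0:
        return None  # cloud is no more expensive per month, so buying never pays off
    return math.ceil(upfront_cost / monthly_saving)


def buying_pays_off(break_even, runway_months):
    """Buying only makes sense if we break even before the money runs out."""
    return break_even is not None and break_even < runway_months


# Made-up numbers for illustration only.
months = break_even_months(
    upfront_cost=70_000,        # hypothetical server purchase price
    onprem_monthly_cost=1_000,  # hypothetical power, hosting, maintenance
    cloud_monthly_cost=11_000,  # hypothetical equivalent cloud GPU bill
)
worth_buying = buying_pays_off(months, runway_months=18)
```

With a break-even this short, the upfront cost only becomes the deciding factor when the runway is shorter than the break-even period.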