by AnthonyMouse
1144 days ago
There are workloads where newer CPUs are dramatically faster (e.g. AVX-512), but in general the difference isn't huge. Most of what the newer CPUs get you is more cores and higher power efficiency, which you don't care about when you're paying per vCPU. Which vCPU is faster, a ten-year-old Xeon E5-2643 v2 at 3.5GHz or a two-year-old Xeon Platinum 8352V at 2.1GHz? It depends on the workload. Which has more memory bandwidth per core? But the cloud provider prefers the latter because it has 500% more cores for 50% more power. Which is why the latter still goes for >$2000 and the former is <$15.
It really does not depend on the workload when the workloads we're talking about are by and large limited to 1 vCPU or less (CI jobs, serverless functions, etc). Ice Lake cores are substantially faster than Ivy Bridge; the 8352V will be faster in practically any workload we're talking about.
However, I do agree with this take if we're talking about, say, lambda functions. The reason is that the vast majority of workloads built on lambda functions are bound by I/O, not compute, so newer core designs won't result in a meaningful improvement in function execution time. Put another way: is a function executing in 75ms instead of 80ms worth paying 30% more? (I made these numbers up, but it's the illustration that matters.)
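To put rough numbers on that question, here is a minimal sketch. It assumes billing is proportional to execution time and uses the made-up 80ms, 75ms and 30% figures from above; the function names are hypothetical.

```python
# Made-up numbers from the comment: 80 ms on the old core, 75 ms on the new
# core, with the new core priced 30% higher per unit of execution time.
# Assumption: cost scales linearly with billed duration.
OLD_MS = 80.0
NEW_MS = 75.0
PRICE_PREMIUM = 0.30


def relative_cost(old_ms: float, new_ms: float, premium: float) -> float:
    """Cost of one invocation on the new core divided by cost on the old core."""
    return (new_ms * (1.0 + premium)) / old_ms


def break_even_ms(old_ms: float, premium: float) -> float:
    """Duration the new core must reach for the invocation to cost the same."""
    return old_ms / (1.0 + premium)


ratio = relative_cost(OLD_MS, NEW_MS, PRICE_PREMIUM)
target = break_even_ms(OLD_MS, PRICE_PREMIUM)

print(f"cost per invocation on the new core: {ratio:.2f}x the old core")
print(f"break-even duration on the new core: {target:.1f} ms")
```

Under those assumptions each invocation costs about 22% more while finishing about 6% sooner, and the new core would have to bring the function down to roughly 61.5ms just to break even on cost.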
CI is a different story. CI runs are only bound by I/O for the smallest of projects; downloading that 800 MB node:18 base Docker image takes some time, but it can very easily and quickly be dwarfed by all the things that happen afterward. This is not an uncontroversial opinion; "the CI is slow" is such a meme of a problem at engineering companies nowadays that you'd think more people would have the sense to look at the common denominator (the CI hosts suck) and not blame themselves (though often there's blame to go around). We've got a project that can build locally on an M2 Pro, docker pull and push included, in something like 40 seconds; the CI takes 4 minutes. It's the crusty CPUs; it's the slow networking; it's the "step 1 is finished, wait 10 seconds for the orchestrator to realize it and start step 2".
And I think we, the community, need to be more vocal about this when speaking on platforms that charge by the minute. They are clearly incentivized to leave it shitty. It should even surface in discussions about, for example, the markup of Lambda versus EC2. A 4096 MB Lambda function would cost $172/mo if run 24/7, back-to-back. A comparable c6i.large: $62/mo; a third the price. That's bad enough on the surface, and we need to be cognizant that it's even worse than it initially appears, because Amazon runs Lambda on whatever they have collecting dust in the closet, and people still report getting Ivy Bridge and Haswell cores sometimes, in 2023. So the better comparison is probably a t2.medium @ $33/mo; a 5-6x markup.
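For anyone who wants to check the arithmetic, a rough sketch follows. The per-GB-second rate is an assumption (an x86 on-demand rate that reproduces the $172 figure, ignoring request charges and the free tier), and the two EC2 monthly prices are simply the figures quoted above, not looked up.

```python
# Assumed Lambda compute price per GB-second (x86, on-demand). This is an
# assumption chosen because it reproduces the ~$172/mo figure in the comment;
# per-request charges and the free tier are ignored.
LAMBDA_USD_PER_GB_SECOND = 0.0000166667

MEMORY_GB = 4096 / 1024                # the 4096 MB function from the comment
SECONDS_PER_MONTH = 30 * 24 * 3600     # 30-day month, running back-to-back

# EC2 monthly prices taken directly from the comment.
C6I_LARGE_USD_PER_MONTH = 62.0
T2_MEDIUM_USD_PER_MONTH = 33.0

lambda_monthly = MEMORY_GB * LAMBDA_USD_PER_GB_SECOND * SECONDS_PER_MONTH
vs_c6i = lambda_monthly / C6I_LARGE_USD_PER_MONTH
vs_t2 = lambda_monthly / T2_MEDIUM_USD_PER_MONTH

print(f"Lambda, 4 GB, 24/7: ${lambda_monthly:.2f}/mo")
print(f"vs c6i.large: {vs_c6i:.1f}x")
print(f"vs t2.medium: {vs_t2:.1f}x")
```

With these inputs the Lambda function lands a little under $173/mo, about 2.8x the c6i.large figure and a bit over 5x the t2.medium figure.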
This isn't new information; Lambda is crazy expensive; blah blah blah; but I don't hear that dimension brought up enough. Calling back to my previous point: is a function executing in 75ms instead of 80ms worth paying 30% more? Well, we're already paying 550% more; the fact that it doesn't execute in 75ms by default is abhorrent. Put another way: if Lambda, and other serverless systems like it such as hosted CI runners, enable cloud providers to keep old hardware around far longer than performance improvements say they should, the markup should not be 500%. We're doing Amazon a favor by using Lambda.