Hacker News
by rtpg 678 days ago
I really am hopeful we come a bit full circle on builders and machines to "we buy one or two very expensive machines that run CI and builds". Caching in particular is just sitting there, waiting to be properly captured, instead of constantly churning on various machines.

Of course, CI SaaSes implement a lot of caching on their end, but they also try to put people on the most anemic machines possible to try and capture those juicy margins.


> we buy one or two very expensive machines that run CI and builds

This unfortunately does not work for orgs that have, say, more than 20 engineers. The core issue is that once you have a test suite large enough to have ~30 shards, you only need one engineer `git push`ing once to saturate those 1-2 expensive machines you've got sitting in the office.

The CI workload is quite amenable to "serverless" when you get to a large enough org size, where most of the time you actually want to pay nothing (i.e. outside your business hours) but when your engineers are pushing code, you want 1500 vCPUs on-demand to run 4 or 5 test suites concurrently.

Sounds like somebody should set up incremental CI with Bazel :)

Seriously though, of course there are a lot of details here, but I think people tend not to really internalize how much testing is about confidence, and things like incremental CI can really chew away at how big your test suite needs to be. There are some things that are just inherently slow, but I've seen a lot of test suites that spend most of their runtime rerunning tests that only exercise unchanged code.

My glib assertion is that there is likely to be no test suite generated by 20 engineers that requires 30 shards and is impossible to chop up with incremental CI. And downstream of that, getting incremental CI would improve DX a lot, cuz I bet those 30 shards take a long time.
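A minimal sketch of the idea behind incremental CI, assuming you already have a dependency graph for your build targets. The target names and the `affected_tests` helper are hypothetical, and this is not how Bazel exposes it; it only illustrates "rerun just the tests that can reach a changed file":

```python
from collections import defaultdict, deque


def affected_tests(deps: dict[str, set[str]], tests: set[str], changed: set[str]) -> set[str]:
    """Return the tests that transitively depend on any changed file.

    deps maps each target to the files/targets it depends on directly.
    """
    # Invert the graph: dependency -> targets that depend on it directly.
    rdeps = defaultdict(set)
    for target, direct in deps.items():
        for dep in direct:
            rdeps[dep].add(target)

    # Breadth-first walk upwards from the changed files.
    seen = set(changed)
    queue = deque(changed)
    while queue:
        node = queue.popleft()
        for parent in rdeps[node]:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)

    return seen & tests


# Hypothetical toy repo: three tests, only some of which touch each file.
deps = {
    "lib_a": {"a.py"},
    "lib_b": {"b.py", "lib_a"},
    "test_a": {"lib_a"},
    "test_b": {"lib_b"},
    "test_c": {"c.py"},
}
tests = {"test_a", "test_b", "test_c"}

only_b = affected_tests(deps, tests, {"b.py"})  # just test_b
only_a = affected_tests(deps, tests, {"a.py"})  # test_a and test_b
```

In this toy graph a change to `b.py` reruns one test out of three. How much a real suite shrinks depends on how tangled its dependencies are.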

incremental CI is absolutely the way to go
I can get a 48-core/96-thread dedicated server for 200€ a month on Hetzner. The cheapest EC2 instance that comes close costs 2€ per hour. I can get nearly 10 Hetzner servers running around the clock for that price.

Obviously the dedicated machines are not rentable per hour, but the cloud is so much more expensive.
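A back-of-the-envelope sketch of that comparison, using the two prices quoted above. The 730-hour month is my assumption, as are the variable names:

```python
HOURS_PER_MONTH = 730  # assumption: average month, 8760 hours / 12

hetzner_monthly_eur = 200.0  # dedicated server price from the comment
ec2_hourly_eur = 2.0         # comparable EC2 price from the comment

# Cost of keeping the EC2 instance up for the whole month.
ec2_monthly_eur = ec2_hourly_eur * HOURS_PER_MONTH

# How many dedicated servers the same money buys.
servers_for_same_price = ec2_monthly_eur / hetzner_monthly_eur

# Hours of EC2 usage per month at which both options cost the same.
break_even_hours = hetzner_monthly_eur / ec2_hourly_eur

print(f"EC2 always-on: {ec2_monthly_eur:.0f} EUR/month")
print(f"Dedicated servers for that price: {servers_for_same_price:.1f}")
print(f"Break-even: {break_even_hours:.0f} EC2 hours/month")
```

With these inputs the always-on ratio comes out closer to 7 than 10, and the hourly instance only wins if it runs for fewer than about 100 hours a month.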

Very much this. I've overseen this process for one of my clients and we've seen build + deploy times go from 5-10 minutes down to around 1-2 minutes. This was down to increased performance, improved caching, and being able to cut out some minor workflow setup steps.

So 10x cheaper and 5x the performance.

Still using GitHub Actions, but now just using self-hosted runners.

Well, the point was that if 4 concurrent `git push`es saturate up to 1500 vCPUs, then you'd need 16 of those Hetzner dedicated servers (whose uptime you have to manage), and you're paying for them for the entire month. ~4 pushes is a very small number, and an org with a few dozen engineers will regularly see peaks higher than this.

Additionally, you'd have to ensure some isolation across your test runs (either by running the test suites in ephemeral containers, or VMs) which is additional engineering work for something that isn't business critical.

In my experience, dedicated hardware has provided a baseline real-world 2x speedup over cloud instances (presumably down to no contention and local NVMe). So that would be 8 Hetzner instances instead of 16.

I managed to squeeze out a 5x speed up total (see my other comment). In which case that would mean 3-4 instances.

Plus with shorter build times you may then find that having builds queued up is acceptable.
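The server counts in this subthread can be reproduced with a quick sketch. It assumes one hardware thread counts as one vCPU and that an N-times speedup divides the capacity you need by N, both simplifications. `servers_needed` is a hypothetical helper:

```python
import math


def servers_needed(peak_vcpus: int, threads_per_server: int, speedup: float = 1.0) -> int:
    """Servers required to cover a peak vCPU demand, rounded up.

    speedup is how much faster the dedicated hardware runs the same work,
    modelled here as needing proportionally fewer vCPUs.
    """
    effective_vcpus = peak_vcpus / speedup
    return math.ceil(effective_vcpus / threads_per_server)


baseline = servers_needed(1500, 96)             # no speedup
with_2x = servers_needed(1500, 96, speedup=2)   # the 2x "baseline" claim
with_5x = servers_needed(1500, 96, speedup=5)   # the 5x claim

print(baseline, with_2x, with_5x)
```

Rounding up gives 16, 8 and 4 servers respectively, which lines up with the 16, 8 and "3-4" figures quoted above.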