by kami8845 1528 days ago
A few issues I have with this blog post:

1. It doesn't show off the unique capabilities of firecracker very well.

2. The comparison is not very fair.

2a. The docker-build step (which dominates the runtime) is run without any caching. Just by adding 2 lines to your build-push-action, "cache-from: type=gha" and "cache-to: type=gha,mode=max", you can make it a lot faster.

2b. ~1m20s of the time is just "VM start". GitHub Actions has had a rough time recently, but you should never wait that long to get your CI running in day-to-day operation.

2c. The tests are unrealistically short at 20s, which is what allows the author to get to their 10x faster number.
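
For 2a, here is a rough sketch of where those two cache lines would sit in a workflow. Only the cache-from and cache-to values come from the point above. The job layout, the action versions and the image tag are my assumptions, not something taken from the blog post's repo, and the gha cache backend needs Buildx set up first.

```yaml
# Sketch only: job name, action versions and image tag are assumptions.
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Buildx is required for the GitHub Actions cache backend.
      - uses: docker/setup-buildx-action@v3
      - uses: docker/build-push-action@v5
        with:
          context: .
          push: false
          tags: example/app:ci   # hypothetical tag
          # The two lines from 2a:
          cache-from: type=gha
          cache-to: type=gha,mode=max
```

With mode=max the intermediate layers are exported to the cache too, not just the layers of the final image.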

Let's say the GitHub Action starts in 5 seconds, the GitHub Actions cache reduces the build time to 2 minutes and the tests take 10 minutes to run. Now Firecracker is 20% faster ...
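
To spell out the arithmetic behind that estimate, here is a minimal sketch. The GitHub Actions numbers are the ones from the scenario above. The Firecracker side is an assumption on my part: I treat its start and build overhead as zero, which is the most favourable case for it.

```python
# All figures are in seconds.
def total_seconds(start, build, tests):
    """Wall-clock time of one CI run, assuming the phases run back to back."""
    return start + build + tests

# GitHub Actions: 5s start, 2 min cached build, 10 min of tests.
gha = total_seconds(start=5, build=120, tests=600)

# Firecracker: ASSUMED zero start/build overhead, same 10 min of tests.
firecracker = total_seconds(start=0, build=0, tests=600)

speedup = gha / firecracker          # ratio of wall-clock times
time_saved = 1 - firecracker / gha   # fraction of the run that disappears

print(f"GHA: {gha}s, Firecracker: {firecracker}s")
print(f"speedup: {speedup:.2f}x, time saved: {time_saved:.0%}")
# prints:
# GHA: 725s, Firecracker: 600s
# speedup: 1.21x, time saved: 17%
```

So even with zero overhead on the Firecracker side, the gap is around 20% once the tests dominate the run.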

You can also get comparable performance out of https://buildkite.com/ which lets you self-host runners on AWS, meaning you're almost guaranteed to get a hot docker cache (running against locally attached SSDs). You can then start running your tests (almost) as fast, with much more mature tooling.

3 comments

> You can also get comparable performance out of https://buildkite.com/ which lets you self-host runners on AWS

You can self-host GitHub runners as well, with a few caveats. The most serious one is that you are then responsible for cleaning up the state of your self-hosted runner between runs.

https://docs.github.com/en/actions/hosting-your-own-runners/...

Structural isolation guarantees of the form "build execution during run N cannot possibly impact build execution of run N+1" are tremendously helpful -- they reduce the number of weird CI failures and the cost to triage and fix each weird CI failure (by reducing the space of possible interactions). If you cannot offer similar guarantees when self-hosting your own CI infrastructure, then it may not be wise to self-host.

> structural isolation guarantees of the form "build execution during run N cannot possibly impact build execution of run N+1" are tremendously helpful

Yes, although the flip side is that it can make caching much less efficient. My experience is that the caching layer can often take several minutes to download, dominating CI run time, whereas this might be instant on a self-hosted box that just leaves the previous build artefacts in place.

I tried to get docker layer caching working within GHA for a second benchmark, but it seems like none of the approaches work particularly well for a "docker-compose build" - I'd happily amend the post with a second benchmark if you wouldn't mind opening a PR based on the existing one [1]

[1] https://github.com/webappio/livechat-example/blob/be7c9121c1...

The point still stands for 2c - you can super easily parallelize with firecracker (by taking a snapshot of the state right before the test runs, then loading it a bunch of times)

I did something similar with VMware nearly 8-10 years ago: just separate storage from the OS in your thinking and automate provisioning as needed.

It’s possible to do something similar with EC2, but if you want speed then move your build farm on-prem to utilise storage provider technologies, e.g. https://tintri.com/blog/. Portworx+Stork would probably get you most of the way if you went down the k8s route, but it's not something I’ve seen or looked at in detail.

In my experience, if your OS boot time is your main bottleneck then you are likely over-optimising before expanding your testing environments. Fun, but not the best use of development time. It is easier to have a minimum number of systems online before they are needed and just attach new storage to them as needed.

I have to mention GitLab here. Their runners are extremely easy to self-host.