by gravypod 1876 days ago
I've set up multiple CI systems and it really depends on what you need to test. A long time ago I built a CI for a team that ran end-to-end integration tests and collected code coverage from each running service. This took between 4 and 10 minutes to run for our 3 to 6 services. At another job, I set up a git repo + CI for a team of about 15ish people. In the beginning we had no CI, then I containerized everything and the CI took a long time (~20 minutes). Then I switched to a build/test system that was more in tune with what we needed, and I ultimately (through some hacks) got the entire CI time for ~20ish microservices down to <1 minute, since I was using Bazel to cache everything that hadn't changed. After that I added a stage that collected code coverage from all 20 of those services, which was much slower since Bazel had a hard time understanding how to cache that for some reason. This brought it back up to 4ish minutes.

The main blockers I've seen to CI performance are:

1. Caching: Most build systems are intended to run on a developer's laptop and do not cache things correctly. Because of this, most CIs completely chuck all of your state out of the window. The only CI I've found that lets you work around this is GitLab CI (this is my secret for getting a <1 min build/test CI pipeline).

2. What you do in CI: If you want to run end-to-end integration tests, it's going to be slow. Any time you're accessing a disk, accessing the network, or doing anything else that leaves memory, it's slow. Make sure your unit tests are written to use mocks/fakes/stubs instead of real implementations of DBs like SQLite or Postgres.

3. The usage pattern: If you don't have developers utilizing your CI machines 100% of the time you are "wasting" those resources. People will often say "let's autoscale these nodes" and, when you do, you'll notice they scale down to 1 node when everyone is asleep; then everyone starts work and pushes code, and the CI grinds to a halt. You can make a very inefficient CI just by having the correct number of runners available at the correct time.
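
To make point 2 concrete, here is a minimal sketch of a unit test that uses an in-memory fake instead of a real database. Everything in it (UserStore, FakeUserStore, greeting) is a hypothetical name made up for illustration, not something from the setups described above; the only assumption is that the code under test takes its storage as a parameter rather than opening a DB connection itself.

```python
from typing import Dict, Optional, Protocol


class UserStore(Protocol):
    """Hypothetical interface the production code depends on."""

    def get_name(self, user_id: int) -> Optional[str]:
        ...


class FakeUserStore:
    """In-memory fake: no disk, no network, so tests stay fast."""

    def __init__(self, users: Dict[int, str]) -> None:
        self._users = dict(users)

    def get_name(self, user_id: int) -> Optional[str]:
        return self._users.get(user_id)


def greeting(store: UserStore, user_id: int) -> str:
    """Code under test: only talks to the store through the interface."""
    name = store.get_name(user_id)
    if name is None:
        return "Hello, stranger!"
    return f"Hello, {name}!"


def test_greeting() -> None:
    # The fake replaces whatever SQLite/Postgres-backed store runs in production.
    store = FakeUserStore({1: "Ada"})
    assert greeting(store, 1) == "Hello, Ada!"
    assert greeting(store, 2) == "Hello, stranger!"


if __name__ == "__main__":
    test_greeting()
```

The real DB-backed store would still need its own (slower) integration test, but that can live in a separate stage so it doesn't hold up the fast unit test run.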

Another thing to consider: anything you can make asynchronous doesn't need to be fast. If you set up a bot to automatically rebase and merge your code after code review, then you don't really need to think about how fast the CI is.