by WJW 2048 days ago
> If you're at the size where slow CI negatively affects your projects, then you're big enough to own your own CI (at least the build agents).

I think you vastly underestimate how much stuff people want to fit into CI and how quickly it turns into a big blob. I work as a freelancer helping not-quite-startups-anymore with things like CI speedup and tuning the database queries emitted by their ORM. You know, things where it's easy to build up technical debt.

It's not uncommon for a team of 3-4 to build so many tests and add so many linters and whatnot that CI takes more than an hour. Often, some basic love can bring it down to ~5 minutes but many teams are so focused on building new features that they will not take time to sharpen their tools.


Would you be willing to share some easy improvements that could be made? How can a large number of tests run so much faster?

It usually comes down to parallelisation; you want to do as much work simultaneously as you can. Saving CI resources _can_ be reasonable if CI resources are very scarce relative to dev time, as in some open source projects. However, if you are paying your devs, then the extra few hundred bucks a month for beefier CI is often worth the increase in productivity. A couple of different ways:

- Oftentimes the staging of the CI build can be improved. Devs often set up CI so that linters must pass _before_ actual tests are run. Run them in parallel instead and fail the whole run if the linters don't pass. This is even more important if there are multiple linters (perhaps for different sections of the codebase) and they all get applied serially before any of the tests start.

- Obviously, split up your tests as well so they can run in parallel. If you have a project containing both JS and backend tests, don't wait for one suite to finish before starting the other. Many "bigger" languages also have something akin to parallel_tests (https://github.com/grosser/parallel_tests) that lets you quickly set up multiple databases to keep transactions separate, etc. It also provides tooling to record the runtimes of previous runs and uses them to equalize the parallel tracks of subsequent runs as much as possible.

- Cache as much as possible. This is a wider topic, but dependencies, docker layers and static assets can all be cached, and getting this right can hugely cut CI time all on its own. You don't want to know how many projects I've seen that don't have this set up correctly (or at all).

- Longer running projects can have hundreds of database migrations, and applying them all to an empty database can take minutes. Big frameworks like Rails can dump the schema for you in a way that you can load in a second instead. Have one separate job that runs in parallel, applies all the migrations and then verifies the result against the dumped schema; all the other jobs just load the schema and use that.
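
To make the "run linters and tests in parallel, fail the whole run if any of them fails" point a bit more concrete, here is a rough Python sketch. In practice you would usually express this as parallel jobs in your CI provider's config rather than in a script; the stage names and the commands below are placeholders I made up for illustration, not anything from a real project.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_stage(name, command):
    """Run one CI stage as a subprocess and return (name, exit_code)."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return name, completed.returncode


def run_pipeline(stages):
    """Start every stage at once; the run passes only if all stages exit 0."""
    with ThreadPoolExecutor(max_workers=max(1, len(stages))) as pool:
        futures = [pool.submit(run_stage, name, cmd) for name, cmd in stages.items()]
        results = dict(future.result() for future in futures)
    failed = sorted(name for name, code in results.items() if code != 0)
    return len(failed) == 0, failed


# Hypothetical stages: each command is a stand-in for a real linter or test
# runner. The "lint" stand-in exits with 1 to simulate a failing linter.
stages = {
    "lint": [sys.executable, "-c", "import sys; sys.exit(1)"],
    "backend-tests": [sys.executable, "-c", "print('backend ok')"],
    "js-tests": [sys.executable, "-c", "print('js ok')"],
}

ok, failed = run_pipeline(stages)
print("passed" if ok else f"failed stages: {failed}")
```

The wall-clock time of the run is roughly that of the slowest stage instead of the sum of all stages, and a failing linter still fails the run.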
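
The "equalize the parallel tracks" idea from the test-splitting point can be sketched in a few lines of Python. This is a generic greedy approach (slowest file first, always into the currently lightest group), not a description of how parallel_tests itself is implemented; the file names and runtimes are invented for the example.

```python
import heapq


def balance_tests(runtimes, workers):
    """Split test files into `workers` groups with roughly equal total runtime.

    `runtimes` maps a test file name to its recorded duration in seconds.
    """
    # Min-heap of (total_seconds_so_far, group_index), so the lightest
    # group is always popped first.
    heap = [(0.0, index) for index in range(workers)]
    groups = [[] for _ in range(workers)]
    # Place the slowest files first; ties are broken by name for determinism.
    for name, seconds in sorted(runtimes.items(), key=lambda kv: (-kv[1], kv[0])):
        total, index = heapq.heappop(heap)
        groups[index].append(name)
        heapq.heappush(heap, (total + seconds, index))
    return groups


# Hypothetical runtimes recorded from a previous CI run.
recorded = {
    "test_checkout.py": 120.0,
    "test_search.py": 90.0,
    "test_users.py": 60.0,
    "test_api.py": 45.0,
    "test_emails.py": 30.0,
    "test_utils.py": 15.0,
}

groups = balance_tests(recorded, 3)
for group in groups:
    print(sum(recorded[name] for name in group), group)
```

With these made-up numbers, all three groups end up at 120 seconds, so no worker sits idle while another grinds through the slow files. Real runtimes will not divide this neatly, but the greedy split usually gets close enough.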