Hacker News
by GauntletWizard 3640 days ago
I disagree. Not that I don't think simple, reusable components are valuable; they are, and any developer should run tests before sending things off to the cloud. But having things work on developers' machines is itself a code smell, because developers have access to those machines and can change them by hand.

Your whole project should build straight from very fresh boxes, and developers' machines will never be fresh boxes. It's hard to remove the cult knowledge from a development team. My project just discovered a new unlisted dependency when doing a deploy, because every developer knew about it and had installed it on their machine beforehand. It was explicitly listed in the build/devtools dependencies, but it was not supposed to be a runtime dep. Had the developers run tests on a fresh machine, they'd have run into it. (Of course, the CI team had also installed it on the CI boxes, because they used it for debugging.)
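To make the "unlisted runtime dependency" case concrete, here is a rough sketch of a check a fresh box could run. It assumes a Python codebase and a hand-maintained set of declared runtime packages; the function names and the example package names are made up for illustration, not taken from the project described above.

```python
import ast
import sys


def imported_modules(source):
    """Return the top-level module names imported by a piece of Python source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                found.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # level > 0 is a relative import, i.e. internal to the project.
            if node.level == 0 and node.module:
                found.add(node.module.split(".")[0])
    return found


def undeclared_imports(source, declared_runtime):
    """Imports that are neither declared runtime deps nor standard library.

    sys.stdlib_module_names exists on Python 3.10+; on older versions this
    falls back to an empty set, so stdlib imports would be reported too.
    """
    stdlib = set(getattr(sys, "stdlib_module_names", ()))
    return imported_modules(source) - set(declared_runtime) - stdlib


if __name__ == "__main__":
    # Hypothetical module that imports a package nobody listed as a runtime dep.
    example = "import requests\nimport yaml\n"
    print(undeclared_imports(example, {"requests"}))
```

A static check like this only catches imports, so it complements running the tests on a clean machine; it does not replace it.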

For a large class of tests, having the developers run them in the build environment is fine. But you also need to run them in the deploy environment, and to that end developers should hit the CI system. Every test that the CI system runs on each commit should be runnable before the commit. You should have tooling and spare capacity such that the CI system is used to run tests immediately before commit, not right after; that's too late. You should run them whenever a dev sends off for code review. You should run them whenever a dev feels like it; if they're in a good spot, run the unit tests, run the CI tests, see what's broken.

1 comments

Your comments are fantastically correct. Yes, secret dependencies are the worst. We use an OS sandbox to prevent access to anything outside the declared dependencies. That same sandbox prevents access to the network unless a build or test target explicitly declares the need to use networking (e.g. for running tests against network services running on localhost).
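As a toy model of that policy (not our actual sandbox, which is enforced by the OS), the rule is roughly "a target may touch only what it declared, and the network only if it asked for it". Every name below is hypothetical and exists only for this sketch.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


class SandboxViolation(Exception):
    """Raised when a target reaches for something it did not declare."""


@dataclass(frozen=True)
class Target:
    name: str
    declared_inputs: frozenset = frozenset()
    needs_network: bool = False


def check_file_access(target, path):
    """Allow access only to declared inputs or paths underneath them.

    This is a pure path comparison: it does not resolve symlinks or "..",
    which a real OS-level sandbox has to deal with.
    """
    requested = PurePosixPath(path)
    for root in target.declared_inputs:
        allowed = PurePosixPath(root)
        if requested == allowed or allowed in requested.parents:
            return
    raise SandboxViolation(f"{target.name}: undeclared input {path}")


def check_network_access(target):
    """Allow networking only for targets that explicitly declared the need."""
    if not target.needs_network:
        raise SandboxViolation(f"{target.name}: network use not declared")


if __name__ == "__main__":
    unit = Target("unit_tests", frozenset({"src/app"}))
    check_file_access(unit, "src/app/main.c")  # declared, so allowed
    try:
        check_file_access(unit, "/usr/lib/libfoo.so")
    except SandboxViolation as err:
        print(err)
```

The point of the model is that a forgotten declaration fails loudly on every machine, instead of working wherever someone happened to install the missing piece.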

CI runs the exact same build system (though with a few different options so the outputs are easier to inspect during and after the build).

Passing CI is compulsory, as humans aren't allowed to release changes on our team. Humans may only do code review. If and when a change passes code review, it will be deployed automatically once it passes CI.

We use some of the same compute capacity that our CI system uses to scale test runners across many physical machines (though tests run against a pool of freshly cloned VMs using delta disks so we get a pretty big speedup and lots of control over the environment that tests run in).

There's a fascinating correlation between developer machines and build slaves. It's been my experience that needing to install system software of any kind on one usually leads to a headache later. We've gotten it down to just Xcode on OS X and almost just build-essential on Ubuntu.

So in spirit we do exactly what you're saying; we've just found a way to do it while using the same tooling on both CI and developer machines. We also demand that the build slave images are generated straight from install media and a fixed set of files (like those that install Xcode), so the only simple way to add dependencies (i.e. build tools or libraries) is via our build system. Use of apt, Homebrew, etc. is completely separate for our developers. And if they mess with the build system in a way that allows those files to leak in, the fact that build slaves are pristine means that their change will fail CI and never be deployed.

Does my explanation make sense? Happy to answer follow-up questions. Also happy to be shown where our rigor is lacking :)

I forgot to mention that using the same build tool for both CI and developers has a number of other advantages, including artifact caching. When a developer downloads a change, their build will pull artifacts from the caches that build slaves have populated. So in many cases new changes (once deployed) are built only once across the whole company, by the build slave that kicks off the deploy. Everyone else just reuses that cached output.
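For anyone unfamiliar with this kind of cache, a minimal sketch of the idea follows. It assumes outputs are keyed by a hash of the build command plus the contents of the declared inputs, which is a common design but not necessarily how our tool does it; `action_key` and `ArtifactCache` are names invented for the example, and the in-memory dict stands in for a shared cache service.

```python
import hashlib


def action_key(command, inputs):
    """Hash the command and every declared input (name and content)."""
    digest = hashlib.sha256()
    digest.update(command.encode())
    for name in sorted(inputs):
        digest.update(b"\0" + name.encode() + b"\0")
        digest.update(hashlib.sha256(inputs[name]).digest())
    return digest.hexdigest()


class ArtifactCache:
    """In-memory stand-in for a cache shared by CI and developer machines."""

    def __init__(self):
        self._store = {}

    def get_or_build(self, command, inputs, build):
        """Return (output, was_cache_hit); call build(inputs) only on a miss."""
        key = action_key(command, inputs)
        if key in self._store:
            return self._store[key], True
        output = build(inputs)
        self._store[key] = output
        return output, False


if __name__ == "__main__":
    cache = ArtifactCache()
    sources = {"main.c": b"int main(void) { return 0; }"}
    fake_compile = lambda ins: b"obj:" + ins["main.c"]
    # First caller (think: the build machine) builds and populates the cache.
    print(cache.get_or_build("cc -c main.c", sources, fake_compile)[1])
    # Second caller (think: a developer) with identical inputs reuses it.
    print(cache.get_or_build("cc -c main.c", sources, fake_compile)[1])
```

Note that the key only covers declared inputs, so an output that quietly depends on something undeclared would be handed to everyone who shares the same key.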

It was this sharing of artifacts that provided some of the impetus to use a sandbox, since a polluted output could poison the cache in hard-to-detect ways.