by theamk 1254 days ago
The author's problem is pretty simple: the test repo is required for pre-merge tests to pass, but it can be updated independently, without having pre-merge tests pass.

And the answer is pretty simple: pin the specific test repo version! Use lockfiles, or git submodules, or put "cd tests && git checkout 3e524575cc61" in your CI config file _and keep it in the same repo as source code_ (that part is very important!).
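
To make the "pin lives next to the source" idea concrete, here is a minimal sketch. The file name `tests.pin` and both helper functions are made up for illustration; lockfiles or submodules get you the same thing with less code.

```python
from pathlib import Path

# Hypothetical pin file, committed in the same repo as the source code.
PIN_FILE = Path("tests.pin")


def read_pin(text: str) -> str:
    """Return the single pinned revision, ignoring blank lines and # comments."""
    lines = [ln.strip() for ln in text.splitlines()]
    revs = [ln for ln in lines if ln and not ln.startswith("#")]
    if len(revs) != 1:
        raise ValueError(f"expected exactly one pinned revision, found {len(revs)}")
    return revs[0]


def checkout_command(tests_dir: str, rev: str) -> list[str]:
    """Build the git command that moves the test checkout to the pinned revision."""
    return ["git", "-C", tests_dir, "checkout", "--detach", rev]
```

CI would then run something like `subprocess.run(checkout_command("tests", read_pin(PIN_FILE.read_text())), check=True)` before starting the tests, so changing the pin is always a reviewed change to the source repo.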

This solves all of the author's problems:

> new test case is added to the conformance test suite, but that test happens to fail. Suddenly nobody can submit any changes anymore.

The conformance test suite is pinned, so the new test is not used. A separate PR has to update the conformance test suite version/revision, and it must go through the regular driver PR process and therefore must pass. In practice, this is a PR with two changes: update the pin and disable the new test.

> are you going to remember to update that exclusion list?

That's why you use an "expect fail" list (not an exclusion list) and keep it in the driver's dir. As you submit your PR you might see a failure saying: "congrats, test X which was expect-fail is now passing! Please remove it from the list". You'll need to make one more PR revision, but then you get working tests.
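
A rough sketch of how an expect-fail check differs from an exclusion list: the listed tests still run, and a pass is reported as loudly as an unexpected failure. The function name and the data shapes are assumptions for illustration, not any particular test framework's API.

```python
def reconcile(results: dict[str, bool], expected_fail: set[str]) -> list[str]:
    """Return the messages that should fail the CI run; an empty list means green.

    results maps test name -> True if the test passed.
    expected_fail is the expect-fail list kept in the driver's repo.
    """
    problems = []
    for name, passed in sorted(results.items()):
        if passed and name in expected_fail:
            # The test got fixed: force the list to be cleaned up in this PR.
            problems.append(
                f"{name}: expected to fail but passed; remove it from the expect-fail list"
            )
        elif not passed and name not in expected_fail:
            problems.append(f"{name}: failed")
    return problems
```

For example, `reconcile({"a": True, "d": False}, {"d"})` returns an empty list, while a passing `"d"` would produce a "remove it from the list" message.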

> allowing tests to be marked as "expected to fail". But they typically also assume that the TB can be changed in lockstep with the SUT and fall on their face when that isn't the case.

And if your TB cannot be changed in lockstep with the SUT, you are going to have a truly miserable time. You cannot even reproduce the problems of the past! So make sure your kernel is known or at least recorded, and your repos are pinned. Ideally the whole machine image, with packages and all, is archived somehow -- maybe via docker, a raw disk image, or some sort of ostree system.

> Problem #2 is that good test coverage means that tests take a very long time to run.

The described system sounds very nice, and I would love to have something like this. I suspect it will be non-trivial to get working, however. But meanwhile, there is a manual solution: have more than one test suite. "Pre-merge" tests run before each merge and contain a small subset of the tests. A bigger suite runs either continuously (if you use physical machines) or every X hours (if you use some sort of auto-scaling cloud); it covers a larger set of tests and can be triggered manually on a PR if a developer suspects the PR is especially risky.
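
As a sketch of the two-suite split: tag each test with the cheapest tier that runs it, and let the bigger tier include everything below it. The test names are invented, and having the continuous suite be a superset of pre-merge is my assumption here; you could just as well keep the two disjoint.

```python
# Tiers ordered from cheapest to most thorough.
TIERS = ["pre-merge", "continuous"]

# Hypothetical test names, each tagged with the cheapest tier that runs it.
TEST_TIER = {
    "smoke_boot": "pre-merge",
    "api_basic": "pre-merge",
    "stress_8h": "continuous",
    "conformance_full": "continuous",
}


def select_tests(tier: str) -> list[str]:
    """Return the tests for a tier: its own tests plus those of every cheaper tier."""
    cutoff = TIERS.index(tier)
    return sorted(
        name for name, t in TEST_TIER.items() if TIERS.index(t) <= cutoff
    )
```

A manual trigger on a risky PR is then just `select_tests("continuous")` run against the PR branch instead of the main branch.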

You can even have multiple levels (pre-merge, once per hour, 4 times per day) but this is often more trouble than it is worth.

And of course it is absolutely critical to have reproducible tests first -- if you come in to work and find a bunch of failures in the continuous suite, you want to be able to re-run with extra debugging or bisect what happened.
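
Bisecting only works if a re-run of an old commit gives the same answer every time, which is the reason reproducibility comes first. A minimal sketch of the search itself, assuming `is_good` is some hypothetical callback that rebuilds a commit and re-runs the failing test (in real life `git bisect run` does this for you):

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_good: Callable[[str], bool]) -> str:
    """Return the first commit for which is_good is False.

    Assumes commits are in chronological order, commits[0] is good,
    commits[-1] is bad, and every commit after the first bad one is bad too.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] is good, commits[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(commits[mid]):
            lo = mid
        else:
            hi = mid
    return commits[hi]
```

If `is_good` is flaky, the invariant breaks and the search can point at the wrong commit.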


> And the answer is pretty simple: pin the specific test repo version! Use lockfiles, or git submodules, or put "cd tests && git checkout 3e524575cc61" in your CI config file _and keep it in the same repo as source code_ (that part is very important!).

Indeed. Where I work we have a bunch of repos, but they always reference each other via pinned commits. We happen to use Nix, with its built-in 'fetchGit' function; it's also easy to override any of these dependencies with a different revision. For example:

  { helpers ? import (fetchGit {
      url = "git://url-of-helpers.git";
      ref = "master";
      rev = "11111";
    })
  , some-library ? import (fetchGit {
      url = "git://url-of-some-library.git";
      ref = "master";
      rev = "22222";
    }) {}
  }:
  helpers.build-a-service {
    name = "my-service";
    src  = ./src;
    deps = { inherit some-library; };
  }

This is a function taking two arguments ('helpers' and 'some-library'), with default arguments that fetch particular git commits. This gives us the option of calling the function with different values, to e.g. build against different commits.

We run our CI on GitHub Actions, which allows some jobs to be marked as 'required' for PRs (using branch protection rules). The normal build/test jobs use the default arguments, and are marked as required: everything is pinned, so there should be no unexpected breakages.

Some of our libraries also define extra CI jobs, which are not marked as required. Those fetch the latest revision of various downstream projects which are known to use that library, and override the relevant argument with themselves. For example, the 'some-library' repo might have a test like this:

  import (fetchGit {
    url = "git://url-of-my-service.git";  # the downstream project, not some-library itself
    ref = "master";
    # No 'rev' given, so it will fetch 'HEAD'
  }) {
    # Build with this checkout of some-library, instead of the pinned version
    some-library = import ./. {};
  }

This lets us know if our PR would break downstream projects, if they were to subsequently update their pinned dependencies (either because we've broken the library, or the downstream project is buggy). It's useful for spotting problems early, regardless of whether the root cause is upstream or downstream.

Yeah - developers need to control their own tests. In the weird case where they don't control their tests (conformance tests), you need to control when those tests are added.