by ravivyas 1012 days ago
Can you expand on the word "feedback"? What kind of feedback are you looking for from CI?

It's one thing to know a build failed; it's another to know why and how it failed, where it failed, and what the cause was. If I have to wade through, or context-switch to, a separate build system's output to get that, what is this actually doing for me?
Also, which of my 1,000 test suites are flaky, and when did they start being flaky?

Flamebox almost got this right, but it groups runs by wall-clock time instead of by git SHA:

https://flamebox.sourceforge.net

The screenshot is most of what there is to see. Clicking a test run opens the test log.

Does anyone know of a currently-maintained alternative with a less-bad dashboard?

That looks really cool, thanks for the link. Something like this well integrated into Jenkins would be super useful.

Our DevOps team built something custom: they check all test logs of all builds of all branches and aggregate all that into a Grafana dashboard. We use it to monitor failing tests and to get a better grip on flakiness. Works okay but could be better.
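
For what it's worth, the core of that kind of aggregation can be quite small. Below is a minimal sketch, not what any of the tools mentioned here actually do: it assumes test results have already been parsed out of the logs into `(git_sha, test_name, passed)` tuples in chronological order, and it treats a test as flaky when the same SHA has seen both a pass and a failure. The function name `find_flaky` and the sample data are made up for illustration.

```python
from collections import defaultdict


def find_flaky(runs):
    """Return {test_name: first_sha_with_mixed_results}.

    `runs` is an iterable of (git_sha, test_name, passed) tuples,
    assumed to be in chronological order (hypothetical input format).
    """
    outcomes = defaultdict(set)  # (test, sha) -> set of observed outcomes
    sha_order = {}               # sha -> index of first appearance
    for sha, test, passed in runs:
        sha_order.setdefault(sha, len(sha_order))
        outcomes[(test, sha)].add(bool(passed))

    flaky = {}
    for (test, sha), seen in outcomes.items():
        # Both True and False on the same commit: same code, different result.
        if len(seen) == 2:
            prev = flaky.get(test)
            if prev is None or sha_order[sha] < sha_order[prev]:
                flaky[test] = sha
    return flaky


runs = [
    ("a1", "test_login", True),
    ("a1", "test_login", False),  # same SHA, different outcome: flaky
    ("a1", "test_sum", True),
    ("b2", "test_sum", False),    # fails only on a new SHA: likely a real regression
    ("b2", "test_login", True),
    ("b2", "test_login", False),
]
print(find_flaky(runs))  # {'test_login': 'a1'}
```

Grouping by SHA rather than by wall-clock time is what separates "this test started failing after a commit" from "this test gives different answers for the same code".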

From the post:

> Earthly gives you write once, run anywhere build consistency, making it super easy to reproduce CI failures on your local computer.

That doesn't help with non-deterministic failures, and I have yet to see a true "write once, run anywhere" system. It may be 99% "write once, run anywhere", but there's always that 1% edge case.
Exactly. This just reminds me of "works on my computer". Building locally is more advantageous for developing a CI workflow, but I want to be as close to prod as possible, and doing that on snowflake developer workstations is an exercise in futility.
Earthly makes use of BuildKit, which essentially executes the build steps in containers. It provides more isolation from the CI runner / dev workstation. Instead of having developers manage their own build tools, Earthly makes it easy to have the build definition manage them.
All flakiness I’ve observed in code I’ve written has been dominated by things like “this job isn’t getting enough cpu cycles and the test assertion is too aggressive in such a scenario” or “lack of CPU cycles is triggering a race condition” or “statistical test X isn’t written robustly”. Not sure how containers solve these problems. My point being that this tool solves some problems for some teams and maybe for a lot of teams that struggle with this problem, but hard problems remain and this isn’t a silver bullet for that. You can’t outsource stability of the project-specific test infrastructure which is where most of the cost lies these days I think.
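
To make the "assertion is too aggressive" case concrete, here is a minimal sketch of one common mitigation: poll for the condition with a generous deadline instead of sleeping a fixed amount and asserting once. The helper name `wait_until` and its defaults are hypothetical, and this only helps with timing-sensitive assertions, not with genuine race conditions or weak statistical tests.

```python
import threading
import time


def wait_until(predicate, timeout=5.0, interval=0.05):
    """Poll `predicate` until it returns True or `timeout` seconds elapse.

    Returns True if the condition was met, False on timeout.
    Uses a monotonic clock so system clock adjustments don't matter.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Simulated slow background work: finishes after roughly 0.1 s,
# but may take longer on a CPU-starved CI runner.
done = threading.Event()
threading.Timer(0.1, done.set).start()

# Fragile version: time.sleep(0.1); assert done.is_set()
# More tolerant version: allow up to 5 s, return as soon as it's done.
assert wait_until(done.is_set, timeout=5.0)
```

The generous timeout costs nothing on a fast machine, because the loop returns as soon as the condition holds.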
Nix solves all these problems for good. It's a monster of a system, but if you're migrating anyway, you might as well migrate to the real deal, not to some fly-by-night, half-baked thing that gets you halfway there.