Hacker News
by pojzon 2390 days ago
How do you handle situations like this: multiple developers have added merge requests to the queue, and the changes they made are mutually exclusive (an automatic rebase won't work). What happens when the first branch gets merged to master and the next 10 are still in the queue? How do you mitigate that to shorten the development cycle?

Let's just say that in my company it also takes 30 minutes to run tests, and 4 hours to run them on the merge pipeline with FATs and CORE tests. It's way too long and severely cripples productivity.

6 comments

A lot of the below comments touched on things we do (verifying that changesets are independent, breaking tests into smaller pieces, prioritising changes that are likely to succeed). They add up and the approach does become more complex. We wrote an ACM white paper with more of the details[1]. It’s the many edge cases and several optimisation problems that turn this into an interesting theoretical and practical problem.

[1] http://delivery.acm.org/10.1145/3310000/3303970/a29-ananthan...

Sorry, but that link points to a "not found" page.

I hope it is possible to decompose this into two problems:

1. Dependencies in incompatible Merge Requests that need to be accounted for, see https://docs.gitlab.com/ee/user/project/merge_requests/merge... on how to do that.

2. Most merge requests can merge in previous changes; for that you can use merge trains, as detailed in my other comment https://news.ycombinator.com/item?id=21679515

Well, the first step is to optimize, parallelize and refactor so that you do not have a single process that takes hours, but many separate ones you can run at once in a cluster.
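To make the "many separate ones you can run at once" idea concrete, here is a minimal sketch of splitting a test suite across workers. It assumes you have rough per-test runtimes; the function name `shard_tests` and the test names and durations in the usage note are hypothetical, and the greedy longest-first heuristic is just one simple choice, not what any particular CI system does.

```python
import heapq


def shard_tests(durations, num_workers):
    """Greedy longest-first assignment of tests to workers.

    durations: dict mapping test name -> estimated runtime in minutes
               (hypothetical input; real numbers would come from CI history).
    Returns a list of (total_minutes, [test names]), one entry per worker.
    """
    # Min-heap keyed on current load, so the least-loaded worker is on top.
    # The worker index breaks ties, so the test lists are never compared.
    heap = [(0.0, i, []) for i in range(num_workers)]
    heapq.heapify(heap)

    # Place the longest tests first; each goes to the least-loaded worker.
    for name, minutes in sorted(durations.items(), key=lambda kv: -kv[1]):
        load, idx, tests = heapq.heappop(heap)
        tests.append(name)
        heapq.heappush(heap, (load + minutes, idx, tests))

    # Report shards in worker order.
    return [(load, tests) for load, _, tests in sorted(heap, key=lambda w: w[1])]
```

With made-up durations such as `{"fat": 120, "core_a": 60, "core_b": 60, "unit_a": 30, "unit_b": 30}` and three workers, the wall-clock time is bounded by the slowest shard rather than by the sum of all tests, which is why breaking up a single long-running test process matters as much as adding machines.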

If those get too expensive to run, or you cannot speed them up, then you have to do what Chromium does: run them post-commit, then bisect and revert any changes that break the tests. If things are truly broken, you close the tree for a bit while you get the break reverted or fixed.
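The bisect step can be sketched as a plain binary search over the commits that landed since the last known-good run. This is only an illustration under stated assumptions: `find_first_bad` and `is_good` are hypothetical names, the last commit is assumed bad, the commit before the range is assumed good, and the breakage is assumed to persist once introduced (flaky tests violate that).

```python
def find_first_bad(commits, is_good):
    """Return the index of the first commit for which is_good() is False.

    commits: list of commit ids in landing order, where the parent of
             commits[0] is known good and commits[-1] is known bad.
    is_good: callable that runs the (expensive) tests at a commit.
    """
    lo, hi = 0, len(commits) - 1  # invariant: first bad index is in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if is_good(commits[mid]):
            lo = mid + 1  # breakage was introduced after mid
        else:
            hi = mid      # mid is bad, so the culprit is mid or earlier
    return lo
```

The point of bisecting is cost: with N commits since the last green run, this needs about log2(N) runs of the expensive suite to find the change to revert, instead of one run per commit.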

Also, the system that is landing changes tests them optimistically in parallel, assuming they will all succeed, so it isn't limited to landing one change every 30 minutes, for example.
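A rough sketch of that optimistic scheme, assuming each queued change is tested on top of all the changes ahead of it. The names `land_optimistically` and `run_tests` are hypothetical, and this is not a description of Chromium's or anyone else's actual implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def land_optimistically(queue, run_tests):
    """Test every queued change on top of the ones ahead of it, in parallel.

    queue: list of change ids in landing order.
    run_tests: callable taking a list of change ids (a speculative stack)
               and returning True if tests pass on the original master
               plus that stack.
    Returns (landed, rejected).
    """
    landed, rejected = [], []
    pending = list(queue)
    while pending:
        # Speculate that everything ahead in the queue will succeed.
        stacks = [landed + pending[: i + 1] for i in range(len(pending))]
        with ThreadPoolExecutor(max_workers=len(stacks)) as pool:
            results = list(pool.map(run_tests, stacks))
        if all(results):
            landed += pending
            break
        first_fail = results.index(False)
        landed += pending[:first_fail]        # results before the failure still hold
        rejected.append(pending[first_fail])  # send this one back to its author
        pending = pending[first_fail + 1:]    # these assumed it landed, so retest
    return landed, rejected
```

When every change passes, the whole queue lands after one test cycle. The cost shows up on failure: everything queued behind the failing change was tested against a stack that never lands, so that work is thrown away and redone.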

What you describe is typically an architecture problem: if you have a good architecture in place, the problem won't happen, because you have already broken your system up so that places that 10 completely different developers need to touch do not exist in the first place. You need to hire more senior developers to think about this problem and fix it. You should be able to assign every area of code to a small team of developers who work together and coordinate their changes to that area. (Even with common code ownership, you quickly specialize, simply because on a large project you cannot understand everything.)

There are exceptions. Sometimes there is a management problem: management has been told that some things cannot be done in parallel because the problem couldn't be mitigated in the architecture, and they failed to apply project management practices to ensure the developers worked serially.

Sometimes there is a team problem: the 10 developers have been placed on the same team to work on the same thing, and despite all that they still failed to coordinate among themselves to ensure that the changes happened in order.

The robot won’t merge a change in the queue if it can’t be merged or tests fail. The changeset would be left open and the developer notified to fix it.

The whole process assumes that multiple changes in the queue don't depend on each other; if they did, they should all be in the same changeset.

It assumes most do not, but it's entirely possible for someone to change a common library, which makes several downstream changes wait. Even if there are no merge conflicts, if they affect the same tests, changes will have to wait.
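As a rough illustration of the "same tests" rule, here is a sketch that works out which earlier changes each queued change has to wait for. It assumes you can map each change to the set of test targets it affects; `must_wait`, the change ids, and the target names in the usage note are all made up for the example.

```python
def must_wait(queue, affected_tests):
    """For each change, list the earlier queued changes it has to wait for.

    queue: change ids in queue order.
    affected_tests: dict mapping change id -> set of test targets it affects
                    (hypothetical; a build system would normally compute this).
    """
    waits = {}
    for i, change in enumerate(queue):
        # A change waits for any earlier change that affects at least one
        # of the same test targets, even when there is no merge conflict.
        waits[change] = [
            earlier
            for earlier in queue[:i]
            if affected_tests[earlier] & affected_tests[change]
        ]
    return waits
```

For example, a common-library change affecting `{"t_lib", "t_a", "t_b"}` at the head of the queue blocks later service changes affecting `{"t_a"}` and `{"t_b"}`, while a docs-only change affecting `{"t_docs"}` waits for nothing and can be tested and landed independently.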
don't work at uber but have similar problems at my job and i'm quite convinced the problems you ask about are part of the 'not easy' in the OP comment. maybe they can queue whole branches instead of single checkins?