by helldritch 1809 days ago
This has been really frustrating me lately.

Sometimes I just want to quickly merge a small change (maybe a small config change, one or two lines) and then pull on master, branch off and start working again.

I'm regularly having to wait several minutes for the merge, and while I know there are ways around this locally, it just annoys me that something so simple is taking so long.

This feels less like an apology and more like Atlassian saying "Us changing a platform which has worked a certain way for years and that breaking your workflow is YOUR PROBLEM. It's you looking at this wrong, merges have been asynchronous all along." despite our many combined millennia of experience being entirely to the contrary.


I'm really sorry if it comes across that way. We definitely don't think you did anything wrong. I also get that when things work a certain way for a long time, it's totally reasonable to expect them to keep working that way.

If I could distill the message re: slower merges down to 2 essential points, they would be these: (1) we underestimated how impactful this would be for some customers, and that's on us; (2) some, in fact I think many, users believed they needed to wait for the merge to complete, and we wanted to clear up that misunderstanding.

From your use case, where you merge a small change and then want to pull, create a new branch, and start working again right away, I understand this directly affects you. I am surprised merges would be super slow for you if you're just merging small changes, though; average merge times are still just a few seconds. Have you opened a support case?

For many other users, I do think the UX changes we're rolling out will make a difference. There are a lot of users who would click merge, maybe 5-10 seconds would pass, and they would assume something must be wrong so they'd refresh the page and then it would look like nothing happened. Today we pushed out an update so that if you refresh the page and the merge is still in progress, you'll actually be able to see that.

FWIW we do have some longer-term work in progress that will make merges (along with basically all file system I/O) a lot faster; but it's a ways off and represents yet another significant architectural project (though much less disruptive than this one!). I didn't mention it in the article because it will take a while.

I know we don't have to wait for the merge, but having an operation that is synchronous and takes <1s on my computer be asynchronous and take 30 seconds to five minutes on Bitbucket is quite infuriating. Often I want to merge a PR, then switch to my term, pull master, and start branching from the new head. Now I cannot do this. Also, when I want to merge a bunch of PRs, it gets even more confusing.

I suggest you up the capacity on the queue so it feels synchronous and snappy like GitHub, as this is now a pain point.
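
For anyone stuck on the merge, pull, branch workflow described above, one possible local workaround is to poll the remote until master moves before branching. This is only a rough sketch under assumptions: the remote is named `origin`, the branch is `master`, and the helper names `remote_head` and `wait_for_new_head` are made up for illustration. It also cannot tell your merge apart from someone else's push landing first.

```python
import subprocess
import time
from typing import Callable, Optional


def remote_head(remote: str = "origin", branch: str = "master") -> str:
    """Return the commit hash the remote branch currently points at."""
    out = subprocess.run(
        ["git", "ls-remote", remote, f"refs/heads/{branch}"],
        check=True, capture_output=True, text=True,
    ).stdout
    # Output looks like "<sha>\t<ref>"; empty output means no such branch.
    return out.split()[0] if out.strip() else ""


def wait_for_new_head(
    old_head: str,
    get_head: Callable[[], str] = remote_head,
    timeout: float = 300.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll until the remote head differs from old_head; None on timeout."""
    waited = 0.0
    while waited <= timeout:
        head = get_head()
        if head and head != old_head:
            return head
        sleep(interval)
        waited += interval
    return None
```

Usage would be roughly: record `remote_head()` before clicking merge, call `wait_for_new_head(old)` afterwards, and only then pull and branch. The timeout and polling interval here are arbitrary guesses.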

We know that CI build efficacy drops with duration. People wander off to do other things, and either come back to it later than anticipated (all estimates are off by 2x, including estimating when I will check the build again), or forget entirely. Merge-build is one action, and making merges async increases the perceived build time. This is a fundamental UX concept: sequential delays are perceived as a single, longer delay.

That’s not the only reason merges are synchronous, especially in the Atlassian world.

At scale, we have problems with build triggers from commits. Sometimes you have to fire them manually.

At scale, tracking deployment failure is painful. The PR you merge may be in a different module than the deployment plan. So I need to merge a PR, then watch the dominoes fall. But the bigger issue is that Atlassian never finished getting deployments up to feature parity with builds. I have several dashboards that report build health, but deployment health is hard to track. It involves more vigilance and frankly it’s exhausting. Exhausting things get dropped every time people feel tired. To work around this, first you do alerts to team chat, then there are too many alerts so they go to their own channel, then people forget to check the channel, and we have a smaller version of the same problem and no solution.

At scale, I can only control whether MY code has enough tests to detect regressions prior to deployment. Breaking preprod impresses no one, even if automation should have caught it. That means a workflow of merge-build-test prior to moving on to the next task.