Hacker News new | ask | show | jobs
by enneff 4236 days ago
There are a lot of things lacking about GitHub's code review process (pull requests). Off the top of my head:

- Merging a pull request (almost) always creates a merge commit, polluting the change history. Gerrit will automatically rebase the changes atop the master branch head, leaving a nice linear history.

- Pull requests require the contributor to create a public fork of the repository they're committing to. I can see how this works for some people, but I find it gross for each contributor to the Go project to have their own public fork. What a mess.

- Comments on pull requests are sent as soon as they are created. Gerrit allows you to make many comments on a change, as drafts, and then send them all in one go, sending just a single email. This is much easier to manage for a large project.

- Gerrit has the notion of multiple 'patch sets' for a particular change, and you can see diffs between patch sets, so it's much easier to progressively review large changes.

And there are many other small issues with the GitHub pull request process that make it untenable for the Go project.

4 comments

I'm really curious to give Gerrit a try. Is it like the existing Rietveld system used for Go?

- I've been using the "apply mail" flow to avoid merge commits and to squash commits when appropriate (based on Nathaniel Talbott's post http://blog.spreedly.com/author/ntalbott/#.VGVhz5PF_Zs). It will be nice to have something automatic.

- At least contributors with the commit bit can just create a feature branch and a pull request from there. Some open source projects give out the commit bit almost immediately, but with GitHub there's still that initial fork and pull request to prove oneself.

Interested to see how Gerrit addresses this. GitHub doesn't have any way to disable pull requests. :-(

- I've completely disabled email notifications for this reason. There still can be a huge number of notifications in app or via third-party mobile apps. Rietveld is a lot more sane, I only saw notifications for things I was actively working on.

- I usually end up reviewing just the new commits individually. But that requires figuring out which ones I had already reviewed. Not so bad with only a few branches, but I can see it getting out of hand.

I really hope people from GitHub are reading this thread as they are working to improve their tools. :-)

We're going to build a bot that replies to pull requests to the Go core. It'll say "Sorry, we don't take pull requests, but here's how to use Gerrit [link]. Thanks!"
FYI if you have a "CONTRIBUTING" or "CONTRIBUTING.md" document at the project root, a neat little info bar "Please read the [contributing] guidelines before proceeding" will show up to anyone filing a bug or a PR.
Not to be flippant, but if you are not going to use pull-requests, what is the benefit of moving to Github? Will you use the issue tracking system?

If not then it just seems like switching to git using the present google code host would suffice from a technical viewpoint.

We will use the GitHub issue tracker, which will allow easy cross references between go issues and other projects on github.
I have to agree that not using pull requests is missing out on the best feature of github. It's the social, low-overhead way to contribute to a repo.

It seems like it must be possible to hook up a bot to github that watches for pull requests and submits them to gerrit... the Juju team has a bot that does that for our Reviewboard integration. It won't work for more complicated workflows that github doesn't support, like dependent branches or whatever, but it lowers the barrier of entry for initial contributors that just want to submit a simple fix. And the more advanced users can just submit directly to our Reviewboard instance if they want the more advanced features.

Also, the public fork on Github is actually a feature - it means others can easily see what you're working on. You need off-site storage for your personal work anyway, right? You're not going to keep it local on your laptop, so you'll need a branch somewhere regardless.

GitHub code review is terrible. I don't blame them for not using pull requests in the least. If not being able to pull is enough to stop people from contributing, how much is their contribution worth?
I think this is a great move. Also, very similar to the setup used for the OpenStack (http://openstack.org/) project. All reviews done in Gerrit, published to Github when merged. OpenStack also has the bot which closes GitHub pull requests and redirecting to Gerrit. More on their Gerrit/Github integration here: http://ci.openstack.org/gerrit.html For those interested in doing the same for their own, private projects, there is http://forj.io (full disclaimer, I work on this project). Finally, Gerrit now has an async / TTY based client which works offline (reviews on the go!) with Gertty - https://github.com/stackforge/gertty
Smart!
Gerrit is a Rietveld fork, see http://en.wikipedia.org/wiki/Gerrit_(software).
> Pull requests require the contributor to create a public fork of the repository they're committing to.

You can create Pull Requests within a repository. You don't need to fork the repository if you have push access.

Go contributors will not be pushing to the GitHub repository, nor will have direct commit access to the Gerrit hosted git repository.
Nice summary; thank you. I don't know why people put up with the merge commit pollution. Great to see Go taking the best of both worlds.
With git and github you can have the best of both worlds. merge commits (seperate tree), linearized at the end (--no-ff) or rebased commits (pull --rebase).

1 and 3 have their usecases, 3 is esp. useful if the branch contains only 1-2 commits, and 1 is useful if the branch contains > 10. You can add test information into the merge commit.

I hope github will take more gerrit features into their tracker or allow tracker extensions, so that we can live with github alone. gerrit still is a UI and performance nightmare.

The merge history is the true one, it's what you want when browsing backwards to see how particular things changed. Rebase creates fake commits corresponding to a tree that never actually existed anywhere. Your history viewer should be able to linearize the tree for you if you want, but the canonical history in the repository should be the true one.
That's just one workflow.

For Go, we will require that contributors (everyone working on Go) prepare their commit at head (or at least make the commit able to be automatically rebased). It makes no sense to have a merge commit for every single commit. That would add no extra information and double the size of the commit log.

It's not a merge commit for every single commit, only for every, well, merge. I sure hope people are writing new features or bugfixes in several commits, not just doing days of work and committing at the end (though I guess that would be a workflow for people who are worried about the size of the commit log).

What's the purpose of keeping a commit history at all? Really, what are your use cases? For me, the history is for looking around and understanding a particular change, or for bisecting to help understand a bug. Both of these use cases are better supported by seeing the tree as it actually was when someone was working on it, than by seeing an artificial, rebased history. What's your use case where the rebased version makes sense?

This is a fantastic answer on Github's (rare) shortcomings. I hope the Github team is here reading this and taking notes!