Hacker News
by elevation 1928 days ago
I like self-hosting git, but these tutorials only set you up with a one-machine solution. I'd like to be able to self-host a git service that stays up through network, hardware, and OS maintenance.

I know git is distributed by design. So if I want to push code to a pair of servers for better availability, I can do it explicitly:

  git push <remote1> <branch>
  git push <remote2> <branch>
But what if I wanted to make this transparent but still highly available, such that the remote URL in

  git push <remote> <branch>
is actually backed by a HA cluster?

Some of the software and ops needed to make this happen are GitHub's secret sauce. I'm not looking to compete with them, but I would love an open source solution with better uptime than a single DigitalOcean droplet running Debian. Ideally, I could get there without green-fielding raft consensus shims into a modified git binary.

3 comments

Plenty of alternative approaches that can get you there. Here’s one:

  * distributed file system like gluster or ceph for repos
  * clustered db (eg replicated Postgres)
  * redundant instances of gitea
  * load balancing

Am I missing something?
Load balancing redundant gitea over clustered postgres and clustered fs provides a resilient read-only stack.

The trouble comes when the system receives two simultaneous pushes to the same branch. When ceph has to reconcile the two writes, which one wins? There has to be a distributed write mutex. Perhaps this mutex could be acquired in a server-side pre-receive hook on the gitea nodes (a pre-commit hook only runs on the client, so it can't do this job), but it's absolutely necessary (in addition to the other clustered services) to prevent silent data loss or corruption.

Use a post-receive hook that pushes everything to the other instances. The push itself could go over git remotes, rsync, or whatever else you want.

  git remote set-url --add origin $second_url
Not quite HA cluster levels of redundancy, but it's also way simpler to set up.
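For anyone who wants to try the multi-URL remote before pointing it at real servers, here is a minimal sketch using throwaway local bare repositories. The directory names (server1.git, server2.git, client) and the branch name main are assumptions made up for the demo; in practice the two URLs would be your actual servers.

```shell
#!/bin/sh
# Sketch: one remote name ("origin") backed by two URLs, so a single
# `git push origin` updates both. All paths below are hypothetical.
set -eu

work=$(mktemp -d)

# Two bare repositories standing in for the two servers.
git init --quiet --bare "$work/server1.git"
git init --quiet --bare "$work/server2.git"

# A client repository with one commit to push.
git init --quiet "$work/client"
cd "$work/client"
git -c user.name=demo -c user.email=demo@example.invalid \
    commit --quiet --allow-empty -m "first commit"

# The first URL is used for fetches; with no separate pushurl configured,
# pushes go to every URL listed for the remote.
git remote add origin "$work/server1.git"
git remote set-url --add origin "$work/server2.git"

git push --quiet origin HEAD:refs/heads/main

# Both servers should now report the same commit for main.
git --git-dir="$work/server1.git" rev-parse refs/heads/main
git --git-dir="$work/server2.git" rev-parse refs/heads/main
```

Note that the push to each URL is independent: if one server is down, the other still gets the commits and the two drift apart until you push again.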
I actually have this at work for a repo that we push both to the server where it gets deployed and to the source repo where we actually collaborate and have an issue tracker etc. That way the deployment server doesn't need any access to another system, and trust only goes one way. Works great for us. Yes, of course you could set up some limited access for this server to pull from the original source, but for a small team with high security requirements this low-tech solution is quite perfect.
I push to a couple places that have a main git server that everyone pushes to/through and then that somehow pushes everywhere else.

A post-receive hook[1] can be used to automate that.

[1] https://git-scm.com/book/en/v2/Customizing-Git-Git-Hooks#_po...
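A rough sketch of what such a hook could look like, again using throwaway local repositories so it can be run as-is. The remote names mirror1 and mirror2, the paths, and the branch name main are all made up for the example, and `git push --mirror` is just one way to do the fan-out (rsync or per-ref pushes would also work).

```shell
#!/bin/sh
# Sketch: a primary bare repo whose post-receive hook mirrors every
# push to two other repos. All names and paths here are hypothetical.
set -eu

work=$(mktemp -d)
git init --quiet --bare "$work/primary.git"
git init --quiet --bare "$work/mirror1.git"
git init --quiet --bare "$work/mirror2.git"

# Register the mirrors as remotes of the primary repository.
git --git-dir="$work/primary.git" remote add mirror1 "$work/mirror1.git"
git --git-dir="$work/primary.git" remote add mirror2 "$work/mirror2.git"

# Install the hook. It runs inside the primary repo after refs are updated.
mkdir -p "$work/primary.git/hooks"
cat > "$work/primary.git/hooks/post-receive" <<'EOF'
#!/bin/sh
# stdin carries "<old> <new> <ref>" lines; this sketch ignores them
# and simply mirrors all refs to each configured remote.
cat > /dev/null
for remote in mirror1 mirror2; do
    git push --quiet --mirror "$remote" ||
        echo "mirror push to $remote failed" >&2
done
EOF
chmod +x "$work/primary.git/hooks/post-receive"

# A client pushes once, to the primary only.
git init --quiet "$work/client"
cd "$work/client"
git -c user.name=demo -c user.email=demo@example.invalid \
    commit --quiet --allow-empty -m "first commit"
git push --quiet "$work/primary.git" HEAD:refs/heads/main

# The hook should have propagated the commit to both mirrors.
git --git-dir="$work/mirror1.git" rev-parse refs/heads/main
git --git-dir="$work/mirror2.git" rev-parse refs/heads/main
```

Since post-receive runs after the push has already been accepted, a failed mirror push only produces a warning here. The primary is still a single point of failure for writes, so this gives you replicas rather than a true HA cluster.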