by kadendogthing 2614 days ago
As I've stated in another post on here, what's the point of these articles? It just says everything sucks, but doesn't really dive into why or how we could possibly fix any issues they may directly point out. Also it kind of sounds like the author really doesn't have any idea what GitLab is or does, so maybe he should check it out.

But allow me to retort these bald assertions presented in the article:

Monorepos are great.

Multirepos are great.

Git is the best source control system ever. And if you think it could do something better, well, have I got news for you. It's completely open source and extendable with various script entry points and an easily accessible API.

Thanks for reading my blog.

3 comments

> Git is the best source control system ever.

To be clear, I'm not disagreeing. But it is simply not good enough. Any new generation of source control needs to be able to do things that are difficult with Git, and Git simply isn't extensible enough. Microsoft has a Git VFS, and there's Git LFS, but this just doesn't go far enough.

There are good technical reasons why you would use Perforce or even Subversion these days.

The people who made Git made it for working on large, but not huge, open-source code repositories with a traditional model. It doesn't work so well for vendoring, it doesn't work well for artists, it doesn't have locking, it doesn't have access controls (and there's only so much you can add). You can argue that these features don't make sense or we're using Git "wrong" or I can write a bunch of hooks but at some point I just want them to work and I'm tired of fighting with Git to make it happen.

Just personal background, these days I work with closed source and open source, monorepos and multirepos, Git, Subversion, and Perforce all on a regular basis (and sometimes use weird custom setups). Git is by far the most familiar of the three, and I've published some tools for Git repo surgery.

> There are good technical reasons why you would use Perforce or even Subversion these days.

Can you say more? What are some of those reasons? Or link to some data or examples?

With a monorepo, how do you avoid the situation where almost every time you want to commit, you have to pull-and-rebase first? Because somebody has always pushed a change, every minute or two.
In a repo that large, you don't want to have random people pushing to master anyway. Have people commit to branches, and then automation merges the approved branches into master. ("Automation" may be as simple as the "Merge" button in GitHub's UI, or more complex if necessary.)
With git, can you set specific user permissions by directory? We need a way to prevent pull or commit by certain users.

Or require a review before committing to some projects/dirs, but not all.
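As a rough illustration of the write side of this question: a server-side hook (for example a pre-receive hook) can reject a push when the pushing user touches a directory they are not allowed to change. The sketch below only shows the path check. The rule table, the user names, and the `denied_paths` helper are all hypothetical, and it does nothing to restrict reads.

```python
from pathlib import PurePosixPath

# Hypothetical rules: directory prefix -> users allowed to change files under it.
WRITE_RULES = {
    "billing": {"alice"},
    "infra/secrets": {"bob"},
}


def denied_paths(user, changed_paths, rules=WRITE_RULES):
    """Return the changed paths that `user` is not allowed to modify."""
    denied = []
    for path in changed_paths:
        parts = PurePosixPath(path).parts
        for prefix, allowed in rules.items():
            prefix_parts = PurePosixPath(prefix).parts
            # Compare whole path components, so "billing2/" does not match "billing".
            under_prefix = parts[:len(prefix_parts)] == prefix_parts
            if under_prefix and user not in allowed:
                denied.append(path)
                break
    return denied
```

In a real hook, the list of changed paths could come from something like `git diff --name-only <old> <new>` for each updated ref, and the hook would exit non-zero when `denied_paths` returns anything.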

Simplicity. I understood SVN immediately but I'm still struggling with Git.

It's only one thing and perhaps the only one, but it's a huge one. IMO anyway.

Partial checkouts is the main thing that comes to mind that is better with svn than git.
git archive is ... ok, not great, but it works
That's not really a working copy, though. What some need is a tool that lets you check out a part of the repository as a working copy, without checking out the rest. By "part" we might mean more than one directory and its descendants (i.e. not a single root).
hmm, you could frankenstein together a bunch of trees to make it look like a partial checkout, but you couldn't make a new commit without all the parent tree objects up to the root. This sounds just like a subtree to be honest.

Are you frequently checking out a subdir of a repo and committing changes to it? Is it config?

> it doesn't have access controls (and there's only so much you can add)

so there's a lot of drawbacks to using gitolite but we were able to customise access controls down to allowing some users the ability to only change lines of checked-in config only to certain values

How do you prevent users from reading certain parts of the repository, though? This was what I meant by "there's only so much you can add"... you can reject pushes that change parts of the repo, but you can't prevent reads without breaking everything.
> can't prevent reads without breaking everything

I don't understand, you can lie to git-upload-pack and send anything you want to the user?

but when we used gitolite, we put sensitive stuff in a separate server and restricted reads to trusted users/deployment tools

edit: oh I see, you want to let some people clone the repo but with some stuff redacted and still be able to make changes to the non-redacted stuff. I'd use LFS and move the ACLs to the file server, if using a single repo was a hard requirement

> I'd use LFS and move the ACLs to the file server, if using a single repo was a hard requirement

If you're putting a few large files in LFS, or maybe a couple sensitive files, I can understand and I'd say you're still using Git, but with some extensions.

If you're putting an entire sensitive subtree in LFS, I don't think you're really using Git any more, in the sense that many of your standard Git workflows will have to be different.

You have completely missed the point of my post. The point was that my post had as many substantive points as the article in question, with fewer words, obviously. Which should be obvious, but never get in the way of a good tech contrarian article stating everything sucks, I suppose.
Ah, thank you for the clarification, I had interpreted your comment far too generously.
Out of genuine curiosity, do you know of any resources you could point me to to explore specific situations where Git isn't fitting the bill technologically?
The big one is anything that requires locking. Git by nature doesn't support locking.

The next one is repository size. Anything with extremely large history size or checkout size. Very hard to work with using Git. Microsoft has Git VFS. Facebook and Google have modified versions of Mercurial and Git.

High repository velocity. If you are trying to push to a remote and you are always out of sync, it's going to slow you down.

Checking out different commits for different parts of the tree. This one is a bit more rare, it's less common that you'd want this.

Finally, setting ACLs to deny read access to parts of the tree.

For all of these cases, there are some ways you can work around the problem. It's not like you're completely dead in the water with Git, it's not like these things are completely impossible to do in Git. It's just that Git isn't good at everything. It's just that Git is exceptionally good for most people who write code.

One thing about GitLab tooling is that they have features that apply only on a per-repo basis, for example GitLab CI.

Suppose for example we have 2 distinct projects - a backend and frontend, which each have their own testing and deployment strategy. GitLab CI only allows one CI pipeline config per-repository. While we could take care of that with scripting, that can easily get out of hand as we increase the number of distinct "projects", if we wanted to maintain a monorepo. So the tooling encourages us to have separate repos.

However if we do that, since we don't have that convenient single commit hash that a monorepo gives us, then we don't have a good way to ensure that the deployments between projects are synced up, and rollbacks are far more complicated.

It's a contrived example (for instance we could switch to a different CI system and mitigate this issue), but it seems to me that whatever an organization chooses, mono- or poly-repo, they have to build complicated custom configurations and tooling to get over whatever tradeoffs their decision has. And as the number of logical projects (repos, submodules, etc.) and the commit rate increases, then the tooling has to increase in complexity to handle issues of scale.

So I guess the open question is, is there a way we can somehow have both without spending a bunch of engineering cycles writing custom configs and tools?
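To make the "scripting" option for a monorepo a bit more concrete, here is a minimal sketch of the kind of glue code involved: given the files changed in a commit, work out which projects need their own test and deploy steps. The directory layout and the `affected_projects` helper are assumptions for illustration, not anything GitLab CI provides.

```python
from pathlib import PurePosixPath

# Hypothetical monorepo layout: top-level directory -> project name.
PROJECT_ROOTS = {
    "backend": "backend",
    "frontend": "frontend",
}


def affected_projects(changed_paths, roots=PROJECT_ROOTS):
    """Return the sorted names of projects touched by the changed paths."""
    affected = set()
    for path in changed_paths:
        parts = PurePosixPath(path).parts
        # Only the first path component decides which project a file belongs to.
        if parts and parts[0] in roots:
            affected.add(roots[parts[0]])
    return sorted(affected)
```

Each new project means another entry in the table plus its own test and deploy logic, which is where this kind of script tends to grow out of hand.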

GitLab Product Manager for CI here

Thank you for this feedback - it’s something we’re thinking about a lot too. We’ve made some improvements for monorepos (`changes:` keyword) and for micro-services/multi-repo (`trigger:` and `dependency:` keywords) but we’re not satisfied!

We have two open Epics - one for making CI lovable for monorepos (https://gitlab.com/groups/gitlab-org/-/epics/812) and one for making CI lovable for microservices (https://gitlab.com/groups/gitlab-org/-/epics/813). Would love community feedback on the direction those will take us and how we can up level lovability even more.

Thanks for listening! We're all big fans of GitLab overall. Good to know you guys are working on filling the feature gap there.
> single commit hash

sounds like you have a deploy & release issue, not a developing or publishing one. Octopus Deploy was the first system I saw that made a distinction between them, and it eliminated a swathe of issues by simply saying "a release is a set of versioned packages"
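A minimal sketch of the "a release is a set of versioned packages" idea, assuming each project publishes its own versioned package. This is not Octopus Deploy's data model or API; the `Release` type, the `changed_packages` helper, and the version numbers are made up for illustration.

```python
from typing import NamedTuple


class Release(NamedTuple):
    name: str
    packages: dict  # package name -> version string


def changed_packages(current, target):
    """Map each package whose version differs to (current, target) versions."""
    names = set(current.packages) | set(target.packages)
    return {
        n: (current.packages.get(n), target.packages.get(n))
        for n in sorted(names)
        if current.packages.get(n) != target.packages.get(n)
    }


# Hypothetical releases spanning two separately versioned projects.
r1 = Release("r1", {"backend": "1.4.0", "frontend": "2.0.1"})
r2 = Release("r2", {"backend": "1.5.0", "frontend": "2.0.1"})
```

Under this model, rolling back from `r2` to `r1` means redeploying only what `changed_packages(r2, r1)` reports, so the release name plays the role that a single commit hash plays in a monorepo.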

wow octopus deploy got expensive

>what's the point of these articles?

Content marketing.

Sorry for unrelated comment, but I remembered your post about Lyft here and whether Vanguard VTSAX holds it [1]. They updated the holdings on 3/31 and it now shows a $3.4 million holding of ~44k shares of Lyft Class A.

[1] https://news.ycombinator.com/item?id=19640055