by sebastian_io 1936 days ago
Thanks for listing out the points you disagree with! The project is still in early alpha, and so is the README. So anything that is ambiguous or unclear is worth addressing.

The main requirement is performance (a missing point in your list). If Git would be a good candidate as a versioning system for DCC software packages, it would have been picked up by now, but that didn't happen, among other things because of the reasons listed above. Git addresses a completely different target audience and lifecycle than SnowFS. The commit hash integrity is a problem in CG/VFX productions, so is the 4GB limitation, as well as the I/O performance for large binary files. The fact that these issues are still there is fully understandable, given the responsibility and dependencies of this project. That's why SnowFS tries to address the niche requirements with its light implementation.

In terms of the license, this is intentionally the weakest argument of all. The GPL doesn't prevent anyone from shipping Git as an external program with commercial software, and the same goes for libgit2 with its linking exception. So there is not even a real benefit here. But the chosen MIT license is an open invitation for everyone.

P.S. Certain features and technical solutions will be feature-proposed to libgit2


> Thanks for listing out the points you disagree with!

> The main requirement is performance (a missing point in your list)

The points I listed were not selected by me! I listed all the points in the project README verbatim. Other than the 2nd-to-last point (slow binary mod detect), performance wasn't otherwise listed in the README, so I don't know why you're calling me out for omitting it.

> If Git would be a good candidate as a versioning system for DCC software packages, it would have been picked up by now

You seem to be implying something else was picked up instead of Git? Other than SnowFS (which you say is in early alpha), what else has surpassed Git in this space? If nothing else has yet been picked up in place of Git, this argument isn't applicable.

> The commit hash integrity is a problem in CG/VFX productions, so is the 4GB limitation, as well as the I/O performance for large binary files

You seem to be again comparing SnowFS to Git-without-Git-LFS (other than the 4GB limitation which I already addressed in my comment). This is, as I said, disingenuous. Why keep making this selective comparison?

> P.S. Certain features and technical solutions will be feature-proposed to libgit2

That's cool, and I wish them the best of luck with developing these solutions. As I said, I've no issue with SnowFS; new approaches are always cool. I just think their listed justification is disingenuous; they'd be better off simply stating they want to develop something new and leaving it at that.

> Other than the 2nd-to-last point (slow binary mod detect), performance wasn't otherwise listed in the README, so I don't know why you're calling me out for omitting it.

The ones I was referring to for performance are:

- Support for instant snapshots

- Support for instant rollback

About alternatives, Perforce and PlasticSCM are currently commonly used. But I understand your objections, and will check if I can handle certain things differently in the README. Thanks again for your input!

Thanks, and apologies if it came across as overly critical.

I just think things like this can affect how the work done on efforts such as Git-LFS is perceived, and describing things based on their own merits is often a better approach than pointing out what is lacking elsewhere.

A VCS UI for design is something I've been looking for for a LONG time, so I signed up to the SnowTrack public beta immediately. I was just a little confused/concerned to learn it won't have a widely-supported backend to ease things such as synchronisation across devices, sharing resources via a hosted service, etc. Curious to see how this gets handled in the final product.

I really appreciate your critical view on the project, because it makes me reflect on my own stance and arguments and see whether they hold up. E.g. I just removed the "without LFS" argument because you are right, it is not a sustainable argument, and I will address a few more soon to clear things up.

I learned a lot during the development of SnowFS, and the open-source community is the best place to share my experiences; that's why I put it on GitHub. In the end, I would be super happy if these insights could make it over to Git and Git-LFS.

We have a Discord channel, you are very welcome to swing by anytime for a virtual beer :-)

Have you looked into git-annex?

git-annex lets you track references to binary files, using git only for storing references to file hashes.

And you can use custom back ends to efficiently store differential data.
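To make the "references to file hashes" idea concrete, here is a minimal sketch of content-addressed storage in Python. It is not git-annex's actual key format, on-disk layout, or API; the function names (`add_to_store`, `write_pointer`, `resolve_pointer`) and the flat store directory are assumptions made up for illustration. The point is only that the versioned part is a tiny pointer file, while the large content lives elsewhere under its hash.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def add_to_store(src: Path, store: Path) -> str:
    """Copy src into the store under its SHA-256 digest and return the digest."""
    h = hashlib.sha256()
    with src.open("rb") as f:
        # Hash in 1 MiB chunks so large binaries never have to fit in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    digest = h.hexdigest()
    store.mkdir(parents=True, exist_ok=True)
    target = store / digest
    if not target.exists():  # identical content is stored only once
        shutil.copyfile(src, target)
    return digest


def write_pointer(pointer: Path, digest: str) -> None:
    """Write the small text file that would be committed instead of the binary."""
    pointer.write_text(digest + "\n")


def resolve_pointer(pointer: Path, store: Path) -> Path:
    """Map a pointer file back to the stored content it refers to."""
    return store / pointer.read_text().strip()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        asset = root / "scene.bin"  # hypothetical large asset
        asset.write_bytes(b"\x00" * 4096)
        digest = add_to_store(asset, root / "store")
        write_pointer(root / "scene.bin.ptr", digest)
        print(resolve_pointer(root / "scene.bin.ptr", root / "store"))
```

Because only the pointer changes when the asset changes, the history in git stays small no matter how large the assets get, and the store can sit behind whatever backend handles the bytes.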

For example, I have an annex repo that stores about 150G of text files, but it uses bup [1] to compress it down to about 20G, while I can still access different versions via git.

[1] https://git-annex.branchable.com/special_remotes/bup/

Impressive numbers! Unfortunately I know git-annex only on paper. I gave it a try a while ago, but it was a bumpy start, admittedly most likely user error. Would you mind sharing some details about it (e.g. file counts, etc.)? Can I invite you for a chat? It doesn't need to be long, but this might be easier to discuss live.

Sure, how would you like to get in touch? You have a Discord, right? I was actually looking at your project and thinking of opening a simple PR. (same username)

I have some more example git-annex repos:

This is an annex repo I made of this popular abandonware website:

https://github.com/unqueued/repo.macintoshgarden.org-fileset

And some podcasts:

https://github.com/unqueued/radiolab-fileset

https://github.com/unqueued/ratholeradio-archive

What's cool is that people can use standard pull requests to add files to the repo. And the repo itself is small, but it can represent huge filesets. Datalad has some really fascinating medical imaging data repos that are massive (https://www.datalad.org/datasets.html).

If you wanna see a really good example of a repo with versioned binary files, check out the git-annex repo of previous git-annex binary releases:

https://downloads.kitenet.net/.git/

You can just use standard git workflows to see previous revisions of a file (well, previous hashes) but it is really easy to hook into.

Very excited for a PR. Any help and support is very welcome. :-)

I just cloned one of the repos; it seems I really should look more into git-annex. Feel free to join the Discord channel, that would be the easiest way to go from there.