by pbd 296 days ago
The binary file problem is real - audio files balloon Git repos fast. Git LFS helps but adds complexity. I've had better luck with a hybrid approach: Git for project files/metadata/stems, cloud storage with versioned folders for final mixes. Keeps the repo lean while maintaining the branching workflow benefits.
4 comments

I'm gonna guess that the OP (and many others) might use Git purely for the source file. I don't know about other DAWs, but Ableton, my DAW of choice, uses .als project files, which are usually pretty small and contain references to the WAV and other dependent files. Of course, this doesn't solve the (common) problem of losing all your audio files, but it does maybe make this approach a bit more feasible...
I've never really used it for anything serious, but doesn't Mercurial handle binary files better? I thought there was a more clever binary diff system for it.
I’m sure if GitHub made LFS storage free, adoption for large projects would 100x, and LFS bugs would be fixed fast.
I just learned about git-annex, which provides more options for storage

  > git-annex is not git-lfs, which also uses git smudge filters, and appears to lack git-annex's widely distributed storage and partial checkouts.[1]
https://git-annex.branchable.com
Are there any music formats that allow, conceptually, for easy diffs?

If there are, it's not beyond reason to add something to git to make it work better.

If changing a single bit at the start of file changes the whole thing then it's really a failing of the file format. By which I probably mean the container format.

Git presents things to you as patches, but it doesn't really use patches internally [1], so having a specific diff program doesn't help, except maybe to get readable patches, since IIRC the diff output can be customized.

A version control system that used patches internally would be more space efficient, but probably slower, since it would have to, e.g., apply all patches from version 0 to reconstruct the current version. Git made its choice on this common trade-off based on its original purpose, which was to version control the Linux kernel source code.
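
To make the trade-off concrete, here's a rough Python sketch. It's a toy model, not how Git or any real VCS lays out its storage: `PatchStore`, `SnapshotStore`, and the `(offset, bytes)` patch format are all made up for illustration.

```python
from typing import List, Tuple

# Hypothetical patch format: a list of (offset, replacement_bytes) overwrites.
Patch = List[Tuple[int, bytes]]


def apply_patch(data: bytes, patch: Patch) -> bytes:
    """Overwrite bytes in place at each offset (same-length edits only)."""
    buf = bytearray(data)
    for offset, replacement in patch:
        buf[offset:offset + len(replacement)] = replacement
    return bytes(buf)


class PatchStore:
    """Keeps version 0 in full, then one small patch per later version."""

    def __init__(self, base: bytes) -> None:
        self.base = base
        self.patches: List[Patch] = []

    def commit(self, patch: Patch) -> None:
        self.patches.append(patch)

    def checkout(self, version: int) -> bytes:
        # Every earlier patch is replayed, so cost grows with the version number.
        data = self.base
        for patch in self.patches[:version]:
            data = apply_patch(data, patch)
        return data


class SnapshotStore:
    """Keeps every version in full; checkout is a single lookup."""

    def __init__(self, base: bytes) -> None:
        self.snapshots: List[bytes] = [base]

    def commit(self, patch: Patch) -> None:
        self.snapshots.append(apply_patch(self.snapshots[-1], patch))

    def checkout(self, version: int) -> bytes:
        return self.snapshots[version]


base = b"kick snare hat "
patch_store, snapshot_store = PatchStore(base), SnapshotStore(base)
for edit in ([(0, b"KICK")], [(5, b"SNARE")]):
    patch_store.commit(edit)
    snapshot_store.commit(edit)

print(patch_store.checkout(2))
print(snapshot_store.checkout(2))
```

Both stores return the same bytes for a given version. The difference is that the patch store holds one full copy plus small edits and pays for it at checkout time, while the snapshot store holds a full copy per version and reads it back directly.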

> If changing a single bit at the start of file changes the whole thing then it's really a failing of the file format. By which I probably mean the container format.

Audio formats are typically compressed, so this is hardly avoidable. Compression is a bit like encryption in that regard, except that (good) encryption deliberately introduces random data [2].
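
If anyone wants to poke at this, here's a quick Python experiment. Big caveat: `zlib` is a general-purpose compressor standing in for "a compressed format", not an audio codec, and the sine wave is just an arbitrary test signal I picked. How far a change spreads will depend on the codec and the data.

```python
import math
import struct
import zlib


def pcm_sine(n_samples: int, amplitude: int = 12000) -> bytes:
    """16-bit little-endian mono PCM: a 440 Hz sine sampled at 44.1 kHz."""
    return b"".join(
        struct.pack("<h", int(amplitude * math.sin(2 * math.pi * 440 * i / 44100)))
        for i in range(n_samples)
    )


def count_differing_bytes(a: bytes, b: bytes) -> int:
    """Number of positions that differ, plus any difference in length."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


original = pcm_sine(44100)   # one second of audio
edited = bytearray(original)
edited[0] ^= 0x01            # flip a single bit in the very first byte
edited = bytes(edited)

raw_diff = count_differing_bytes(original, edited)
compressed_diff = count_differing_bytes(
    zlib.compress(original), zlib.compress(edited)
)

print("raw PCM bytes changed:        ", raw_diff)
print("zlib-compressed bytes changed:", compressed_diff)
```

The raw PCM buffers differ in exactly the one byte that was touched. The compressed streams differ in more than that, because at minimum the trailing checksum changes along with the encoded data.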

[1] https://jvns.ca/blog/2024/01/05/do-we-think-of-git-commits-a... (or long story short:) https://news.ycombinator.com/item?id=13644631

[2] https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation

> Audio formats are typically compressed

Not in typical music production workflows.

Yup, I know how git keeps the whole thing, but if the audio file is stored as chunks, with perhaps layers applied, it could store just the changes as, say, mblobs rather than blobs.

The reason for asking about diffs is that they make it easier to break the file down into manageable chunks.
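
Roughly what I have in mind, as a Python sketch. This is not something Git does today; `ChunkStore`, the manifest, and the 4096-byte chunk size are all hypothetical, just to show the idea of storing a file as hashed chunks so that a second version only adds the chunks that changed.

```python
import hashlib
from typing import Dict, List

CHUNK_SIZE = 4096  # arbitrary size, chosen only for illustration


class ChunkStore:
    """Content-addressed store: each distinct chunk is kept once, keyed by its hash."""

    def __init__(self) -> None:
        self.chunks: Dict[str, bytes] = {}

    def put(self, data: bytes) -> List[str]:
        """Split data into fixed-size chunks and return the ordered list of hashes."""
        manifest = []
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # already-known chunks are not stored again
            manifest.append(digest)
        return manifest

    def get(self, manifest: List[str]) -> bytes:
        """Rebuild a version by concatenating its chunks in order."""
        return b"".join(self.chunks[digest] for digest in manifest)


# Version 1: eight chunks with distinct contents.
v1 = b"".join(bytes([i]) * CHUNK_SIZE for i in range(8))

# Version 2: one byte changed inside the fourth chunk.
v2 = bytearray(v1)
v2[3 * CHUNK_SIZE + 10] ^= 0xFF
v2 = bytes(v2)

store = ChunkStore()
manifest_v1 = store.put(v1)
manifest_v2 = store.put(v2)

print("chunks stored for two versions:", len(store.chunks))
```

Two versions of an eight-chunk file end up as nine stored chunks rather than sixteen. Fixed-size chunks only work for in-place edits, though: inserting a sample near the start shifts every later chunk boundary, so a real design would need boundaries that depend on the content.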