Hacker News new | ask | show | jobs
by balajis 6298 days ago
Best way I've found to do joint versioning of code with large datasets (whether binary or tab-delimited text):

1. Check in symbolic links to git. You can include the SHA-1 or MD5 in the file name.

2. Have those symbolic links point to your large out-of-tree directory of binary files.

3. rsync the out-of-tree directory when you need to do work off the server

4. Have a git hook check to see whether those files are present on your machine when you pull, and to update the SHA-1s in the symbolic link filenames when you push

By using symbolic links, at least you have the dependencies encoded within git, even if the big files themselves aren't there.