In addition to what others posted here, sometimes it is nice to put generated files under version control with the source code that generated them. For example, simulation results, deep learning models, graphs that you need to include in your LaTeX documents (which can be considered partly source code, partly generated content).
Also, deep learning training data often consists of large image files, and can also be considered "source code", and in any case it can be very useful to put these under version control.
And finally it can be useful to put external dependencies as tar-files into your source tree.
I appreciate you laying this out because it's something I have struggled with and thought I just didn't know the right way to handle it.
For writing tests in a deep learning code base, rather than simply including a native data file (image, CSV, whatever), I've taken to writing a fake data creator class. It always feels like overkill when an alternative solution is including a native data file or two that already exists.
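For what it's worth, a fake data creator along those lines can stay fairly small. This is only a rough sketch using the Python standard library; the class name `FakeDataCreator`, its methods, and the choice of CSV and PPM as file formats are all made up for illustration and not taken from any particular code base.

```python
import csv
import random
import tempfile
from pathlib import Path


class FakeDataCreator:
    """Writes small, deterministic stand-in files for tests (hypothetical helper)."""

    def __init__(self, root, seed=0):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)
        # A fixed seed keeps the generated data reproducible between test runs.
        self.rng = random.Random(seed)

    def make_csv(self, name, rows=5, cols=3):
        """Write a CSV with one header row and `rows` rows of random floats."""
        path = self.root / name
        with path.open("w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow([f"feature_{i}" for i in range(cols)])
            for _ in range(rows):
                writer.writerow([round(self.rng.random(), 4) for _ in range(cols)])
        return path

    def make_image(self, name, width=8, height=8):
        """Write a tiny binary PPM (P6) image filled with random RGB pixels."""
        path = self.root / name
        header = f"P6\n{width} {height}\n255\n".encode("ascii")
        pixels = bytes(self.rng.randrange(256) for _ in range(width * height * 3))
        path.write_bytes(header + pixels)
        return path


if __name__ == "__main__":
    # Everything lives in a temporary directory, so nothing binary is committed.
    with tempfile.TemporaryDirectory() as tmp:
        creator = FakeDataCreator(tmp)
        print(creator.make_csv("train.csv"))
        print(creator.make_image("sample.ppm"))
```

The trade-off is the one described above: nothing binary ends up in the repository, but you maintain a bit of generator code instead of checking in a file that already exists.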
I want to use it for files other than just source code, for instance graphics, the PDFs generated by LaTeX, or just about anything. Or for storing lots of directories. Currently I use Dropbox for that, but if Git could do that...
Though I agree that there isn't a proper VCS out there for large files (Adobe Bridge was a nice attempt), Git wasn't designed for that, and one might wonder whether you want Git to be _that_ multi-purpose.
If you have a significant project, you want to store at least a reasonable amount of media with it: images, documentation. Git doesn't necessarily have to be the best system for handling multi-gigabyte binaries, but it should at least deal gracefully with small and medium-sized binary files. I'm also not sure why more effort wasn't spent on making Git handle large files well.
By the way, if you're storing large files under version control in Git it is often useful to use the "--depth=1" flag when cloning or pulling repositories. That way you only download the stuff you really need and leave the rest of the history on the server until you need it.
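If you script your checkouts, the shallow clone can be wrapped like this. A minimal sketch, assuming `git` is on your PATH; the helper names (`shallow_clone_command`, `shallow_clone`, `fetch_full_history`) and the example URL are invented for illustration. The git options themselves (`clone --depth`, `-C`, `fetch --unshallow`) are standard.

```python
import subprocess


def shallow_clone_command(url, dest, depth=1):
    """Build the argument list for `git clone --depth=<depth> <url> <dest>`."""
    if depth < 1:
        raise ValueError("depth must be a positive integer")
    return ["git", "clone", f"--depth={depth}", url, dest]


def shallow_clone(url, dest, depth=1):
    """Run the shallow clone; raises CalledProcessError if git fails."""
    subprocess.run(shallow_clone_command(url, dest, depth), check=True)


def fetch_full_history(repo_dir):
    """Download the remaining history for an existing shallow clone."""
    subprocess.run(["git", "-C", repo_dir, "fetch", "--unshallow"], check=True)


if __name__ == "__main__":
    # Hypothetical URL, shown only to illustrate the resulting command line.
    print(" ".join(shallow_clone_command("https://example.com/big-repo.git", "big-repo")))
```

Keep in mind that `--depth` only truncates history: the large files in the commit you check out are still downloaded in full.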
I recently had to deal with a PowerPoint document slightly larger than 50 megabytes, which in today's terms isn't very much. Before, I had kept it in SVN, which has no issues storing larger files. It is a bit shocking that Git has issues with not-so-tiny files.
It doesn't actually matter. Tools should be easy to learn and as free of edge cases as possible. One example of this happening is constant propagation in Rust: it's the same feature, but with every release it covers more of the code base.