by kernelsanderz 1351 days ago
I'd love to see a well-supported git-lfs compatible client/proxy (so you could more easily move backends) that could run on top of S3/object storage, ideally written in a modern language like Go or Rust for performance and parallelism. There are some node.js and various other git-lfs proxies out there, but none maintained well enough that I could count on them being around and working in another 5 years. git-annex at least has been around for a while, even though it has its issues.
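
To make the proxy idea concrete: as far as I understand the git-lfs Batch API, the server mostly answers a JSON request with per-object URLs, which is why pre-signed object-storage links fit so well. A minimal Python sketch, not a real implementation; `batch_response`, `fake_presign` and the bucket URL are made up for illustration, and a real server would also handle auth, already-uploaded objects and errors:

```python
from typing import Callable


def batch_response(request: dict,
                   presign: Callable[[str, str], str],
                   expires_in: int = 3600) -> dict:
    """Build a Batch API style response body for the "basic" transfer adapter.

    `presign(operation, oid)` is a hypothetical callback that returns a
    time-limited URL for the object in whatever storage backend you use.
    """
    operation = request["operation"]  # "download" or "upload"
    objects = []
    for obj in request["objects"]:
        oid, size = obj["oid"], obj["size"]
        objects.append({
            "oid": oid,
            "size": size,
            # The client performs the transfer directly against this href.
            "actions": {
                operation: {
                    "href": presign(operation, oid),
                    "expires_in": expires_in,
                }
            },
        })
    return {"transfer": "basic", "objects": objects}


def fake_presign(operation: str, oid: str) -> str:
    # Placeholder only: a real proxy would ask its storage SDK for a signed URL.
    return f"https://example-bucket.invalid/lfs/{oid[:2]}/{oid[2:4]}/{oid}?op={operation}"


request = {
    "operation": "download",
    "transfers": ["basic"],
    "objects": [{"oid": "ab" * 32, "size": 5}],
}
response = batch_response(request, fake_presign)
```

Since the proxy only hands out links and the bytes go straight to the bucket, the proxy itself can stay small and cheap to run.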

Huggingface uses git-lfs for large datasets with good success. git-lfs on GitHub gets very pricey at higher volumes of data. I would love the affordability of object storage, just with a better git blob storage interface that will still be around in the future.

Most of these systems do their own hash calculations and are not interchangeable with each other. I feel like git-lfs has the momentum in data science at the moment, but it needs better options for people who want low-cost storage that they can control.
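
For reference, here is what the git-lfs side of that hashing looks like, as far as I understand the pointer spec: the file checked into git is a tiny text pointer holding the SHA-256 of the content and its size, and the real bytes live elsewhere under that object ID. A rough Python sketch (`lfs_pointer` is just an illustrative helper name, not part of any tool):

```python
import hashlib


def lfs_pointer(data: bytes) -> str:
    """Return the text of a git-lfs v1 pointer file for the given content."""
    oid = hashlib.sha256(data).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(data)}\n"
    )


pointer = lfs_pointer(b"")
print(pointer)
```

A tool that keys its objects any other way can't reuse these objects directly, which is the interchangeability problem.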

Huggingface is great, but it's one more service to onboard if you're in an enterprise. And data privacy/retention/governance means that many people would like their data to reside on their own infrastructure.

If AWS were to give us a low cost git-lfs hosted service on top of S3 it would be very popular.

If anyone knows of some good alternatives, please let us know!

1 comments

Did some more research to see if anything had changed in this space. I found two interesting projects (haven't used them myself yet though):

One in C# (with support for auth)

https://github.com/alanedwardes/Estranged.Lfs

One in Rust (but no auth, so you have to run a reverse proxy)

https://github.com/jasonwhite/rudolfs

Both seem interesting. Anyone use these?