by ltbarcly3
1285 days ago
The lack of reasons for doing it IS the reason against. Git isn't a magic 'good way' to store arbitrary data; it's a good way to collaborate on projects written in most programming languages, which store code as plain text broken into short lines, where edits to non-adjacent lines can generally be applied concurrently without careful human verification. That is an extremely specific use case, and for anything outside it git is terrible and inefficient, giving almost no benefit despite huge problems. People in ML ops use git because they aren't very experienced with professional programming, they have git available to them, and they haven't yet run into the consequences of using it to store large binary blobs, namely that it eventually becomes impossible to live with and wastes a huge amount of time and space. ML didn't invent the need for large artifacts that can't be versioned in source control but must be versioned alongside it, but they don't know that because they are new to professional programming and aren't familiar with how it's done.
|
MLOps people are very aware of tools that are better suited to the job... too aware, in fact. The entire field is so full of tools, databases, etc. that it's hard to make sense of it. So your comment seems a bit weird to me.