Hacker News
by mardifoufs 1285 days ago
I literally don't know anyone or any team in ML using git as a data versioning tool. It doesn't even make sense to me, and most mlops people I have talked to would agree. Is that really the point of this tool? To be a general purpose data store for mlops? I thought it was for very specialized ML use cases, because even 1TB isn't much for ML data versioning.

Mlops people are very aware of tools that are more suited for the job... even too aware, in fact. The entire field is full of tools, databases, etc., to the point where it's hard to make sense of it. So your comment is a bit weird to me.

2 comments

I build mlops solutions at a big tech company. Agreed, most mature ml teams are not using git for ml data versioning, but in my experience and user research it’s not due to lack of intent. Teams have been forced to move to other ml data tools in the absence of a scalable git solution, and most of those tools come with a lot of cognitive overhead for ml engineers who don’t want to spend time adopting several custom tools in their ml pipelines.
I think you'll find varying levels of maturity in ML ops. Anyway, I think we basically agree: if you use something like this you aren't that mature, and if you are mature you would avoid this thing.