I don't think this makes sense nor is consistent with itself, let alone its other definition[0] > The aim of Open Source is not and has never been to enable reproducible software.
...
> Open Source means giving anyone the ability to meaningfully “fork” (study and modify) a system, without requiring additional permissions, to make it more useful for themselves and also for everyone.
...
> Forking in the machine learning context has the same meaning as with software: having the ability and the rights to build a system that behaves differently than its original status. Things that a fork may achieve are: fixing security issues, improving behavior, removing bias.
For these things, it does mean what most people are asking for: training details.So far companies are just releasing checkpoints and architecture. It is better than nothing and this is a great step (especially with how entrenched businesses are[1]). But if we really want to do things like fixing security issues or remove bias, you have to be able to understand the data that it was originally trained on AND the training procedures. Both of these introduce certain biases (via statistical definition, which is more general). These issues can't all be solved by tuning and the ability to tune is significantly influenced by these decisions. The reason we care about reproducible builds is because it matters to things like security, where we know what we're looking at is the same thing that's in the actual program. It is fair to say that the "aim" isn't about reproducible software, but it is a direct consequence of the software being open source. Trust matters, but the saying is "trust but verify". Sure, you can also fix vulns and bugs in closed source software, hell, you can even edit or build on top of it. But we don't call these things open source (or source available) for a reason. If we're going to be consistent in our definitions, we need to understand what these things are at at least a minimal level of abstraction. And frankly, as a ML researcher, I just don't see it. That said, I'm generally fine with "source available" and like most people use it synonymous with "open source". But if you're going to go around telling everyone they're wrong about the OSS definition, at least be consistent and stick to your values. [0] https://opensource.org/osd [1] Businesses who's entire model depends on OSS (by OS's definition) and freely available research |