No, it's not. The Llama 3 Community License Agreement is not an open source license. Open source licenses need to meet the criteria of the only widely accepted definition of "open source", and that's the one formulated by the OSI [0]. This license has multiple restrictions on use and distribution which make it not open source. I know Facebook keeps calling this stuff open source, maybe in order to get all the goodwill that open source branding gets you, but that doesn't make it true. It's like a company calling their candy vegan while listing one of its ingredients as pork-based gelatin. No matter how many times the company advertises that their product is vegan, it's not, because it doesn't meet the definition of vegan.
There are more licenses than just MIT that are "open source": GPL, BSD, MIT, Apache, some of the Creative Commons licenses, etc. MIT has become the de facto default, though.
These discussions (i.e., everything that follows here) would be much easier if the crowd insisting on the OSI definition of open source would capitalize Open Source.
In English, proper nouns are capitalized.
"Open" and "source" are both very normal English words. English speakers have "the right" to use them according to their own perspective and with personal context. It's the difference between referring to a blue tooth, and Bluetooth, or to an apple store or an Apple store.
This isn't helpful. The community defers to the OSI's definition because it captures what they care about.
We've seen people try to deceptively describe non-OSS projects as open source, and no doubt we will continue to see it. Thankfully the community (including Hacker News) is quick to call it out, and to insist on not cheapening the term.
This is one of the topics that just keeps turning up:
Unless a registered trademark is involved (spoiler: it's not) no one, whether part of a so-called "community" or not, has any authority to gatekeep or dictate the terms under which a generic phrase like "open source" can be used.
Neither of those usages relates to IT; they are both about sources of intelligence (espionage). Even if they did, the OSI definition won. Nobody is using the definitions from the 1995 CIA or the 1996 InfoConWar book in the realm of IT, not even Facebook.
The community has the authority to complain about companies mis-labelling their pork products as vegan, even if nobody has a registered trademark on the term vegan. Would you tell people to shut up about that case because they don't have a registered trademark? Likewise, the community has authority to complain about Meta/Facebook mis-labelling code as open source even when they put restrictions on usage. It's not gate-keeping or dictatorship to complain about being misled or being lied to.
The OSI was founded in 1998, and it defined and popularized the term open source. Its definition has been widely accepted ever since.
Recently, companies have been trying to market things as open source when in reality they fail to adhere to the definition.
I think we should not let these companies change the meaning of the term, which means it's important to explain every time they try to seem more open than they are.
Yeah, a lot of people here seem to not understand that PyTorch really does make model definitions that simple, and that the definition plus the weights is everything you need to resume back-propagation. Not to mention PyTorch itself being open-sourced by Meta.
That said, the Llama license doesn't meet strict definitions of open source, and I bet they have internal tooling for datacenter-scale training that's not represented here.
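To make the "resume back-propagation" point concrete, here's a toy sketch in plain Python. It is not PyTorch and not Llama code; the one-weight model, the `released` values, and the function names are all made up for illustration. It just shows that an architecture definition plus weights is enough to keep training, even with no knowledge of how the weights were produced.

```python
# Toy illustration only: a one-weight "model" in plain Python,
# not PyTorch and not anything from the Llama release.

def forward(params, x):
    """Architecture definition: a single linear unit y = w * x + b."""
    return params["w"] * x + params["b"]

def sgd_step(params, x, target, lr=0.1):
    """One gradient-descent step on squared error, i.e. resumed training."""
    error = forward(params, x) - target
    # Gradients of (prediction - target)^2 with respect to w and b.
    grad_w = 2 * error * x
    grad_b = 2 * error
    return {"w": params["w"] - lr * grad_w, "b": params["b"] - lr * grad_b}

# "Released weights": hypothetical values standing in for a downloaded checkpoint.
released = {"w": 0.5, "b": 0.0}

# Fine-tune on one new example without knowing how `released` was trained.
tuned = sgd_step(released, x=1.0, target=2.0)

loss_before = (forward(released, 1.0) - 2.0) ** 2
loss_after = (forward(tuned, 1.0) - 2.0) ** 2
```

What you can't recover from this is the original data or the original training run, which is the part the rest of the thread is arguing about.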
Source available means you can see the source, but not modify it. This is kinda the opposite, you can modify the model, but you don't see all the details of its creation.
> Source available means you can see the source, but not modify it.
No, it doesn't mean that. To quote the page I linked, emphasis mine,
> Source-available software is software released through a source code distribution model that includes arrangements where the source can be viewed, and in some cases modified, but without necessarily meeting the criteria to be called open-source. The licenses associated with the offerings range from allowing code to be viewed for reference to allowing code to be modified and redistributed for both commercial and non-commercial purposes.
> This is kinda the opposite, you can modify the model, but you don't see all the details of its creation.
That's not the training code, just the inference code. The training code, running on thousands of high-end H100 servers, is surely much more complex. They also don't open-source the dataset, or the code they used for data scraping/filtering/etc.
It's not the "inference code", it's the code that specifies the architecture of the model and loads the model. The "inference code" is mostly the model, and the model is not legible to a human reader.
Maybe someday open source models will be possible, but we will need much better interpretability tools so we can generate the source code from the model. In most software projects you write the source as a specification that is then used by the computer to implement the software, but in this case the process is reversed.
[0] - https://opensource.org/osd