by ogrisel 548 days ago

It's better to be specific:

- open-source inference code

- open weights (for inference and fine-tuning)

- open pretraining recipe (code + data)

- open fine-tuning recipe (code + data)

Very few entities publish the latter two items (https://huggingface.co/blog/smollm and https://allenai.org/olmo come to mind). Arguably, publishing curated large-scale pretraining data is very costly, but publishing code to automatically curate pretraining data from uncurated sources is already very valuable.

Palmik 548 days ago

Also, open weights come in several flavors -- there are "restricted" open weights like Mistral's research license, which prohibits most use cases (most importantly, commercial applications), then there are licenses like Llama's or DeepSeek's with some limitations, and then there are model weights released under the Apache 2.0 or MIT licenses.

cycomanic 548 days ago

Has it been established whether the weights can even be copyrighted? My impression is that AI companies want to have their cake and eat it too: on the one hand they argue that the models are more like a database in a search engine and hence do not violate the copyright of the data they were trained on, but on the other hand they argue that the models meet the threshold to be copyrightable in their own right.

So it seems to me that it's at least dubious whether those restricted licences can be enforced (that said, you would likely need deep pockets to defend yourself from a lawsuit).

jcgl 548 days ago

Then those should not be considered “open” in any real sense—when we say “open source,” we’re talking about the four freedoms (more or less—cf. the negligible difference between OSI and FSF definitions).

So when we apply the same principles to another category, such as weights, we should not call things “open” that don’t grant those same freedoms. In the case of this research license, Freedom 0 at least is not maintained. Therefore, the weights aren’t open, and to call them “open” would be to indeed dilute the meaning of open qua open source.

seberino 548 days ago

Wait, timeout. I thought DeepSeek's stuff was all MIT licensed too, no? What limitations are you thinking of that DeepSeek still has?

Palmik 548 days ago

I am referring to this one: https://huggingface.co/deepseek-ai/DeepSeek-V3/blob/main/LIC...

It seems to be a bit more permissive than Llama's (no MAU threshold).

seberino 547 days ago

Wow. Your link is frustrating because I thought everything was under the MIT license. Why did people claim it was MIT licensed if they sneaked in this additional license?

orra 546 days ago

So, the older DeepSeek-V3 model weights are sadly not permissively licensed.

But the recent DeepSeek-R1-Zero and DeepSeek-R1 have MIT licensed weights.

seberino 545 days ago

Thank you very much. That was helpful. Do we need the older model weights to use the recent DeepSeek-R1-Zero and DeepSeek-R1 models?
