Hacker News
by israrkhan 424 days ago
Agreed. Also, their name makes it seem like it is a totally new model.

If they needed to assign their own name to it, they could at least have included the parent (and grandparent) model names in the name.

Just like the name DeepSeek-R1-Distill-Qwen-7B clearly says that it is a distilled Qwen model.


DeepSeek probably would have done this anyway, but they did release a Llama 8B distillation, and the Meta terms of use require any derivative works to have Llama in the name. So it might also have just made sense to do it for all of them.

Otoh, there aren't many frontier labs that have actually done finetunes.

> the Meta terms of use require any derivative works to have Llama in the name

Technically it requires the derivatives to begin with "llama". So "DeepSeek-R1-Distill-Llama-8B" isn't OK by the license, while "Llama-3_1-Nemotron-Ultra-253B-v1" would be OK.

> [...] If you use the Llama Materials or any outputs or results of the Llama Materials to create, train, fine tune, or otherwise improve an AI model, which is distributed or made available, you shall also include “Llama” at the beginning of any such AI model name.

I've previously written a summary that includes all parts of the license that I think others are likely to have missed: https://notes.victor.earth/youre-probably-breaking-the-llama...
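
To make the naming rule from the quoted clause concrete, here is a rough Python sketch that checks whether a model name begins with "Llama". The function name `begins_with_llama` is hypothetical, and two things are my assumptions rather than anything the quoted text says: that case is ignored, and that an org prefix like `nvidia/` on a hub-style repo id is not counted as part of the model name.

```python
def begins_with_llama(model_name: str) -> bool:
    """Return True if the model name starts with "Llama".

    Assumptions (not stated in the quoted license text):
    - the comparison ignores case;
    - anything before the last "/" is an org/namespace prefix
      and is not part of the model name.
    """
    # Drop a hub-style org prefix such as "nvidia/" if present.
    bare_name = model_name.rsplit("/", 1)[-1].strip()
    return bare_name.lower().startswith("llama")


if __name__ == "__main__":
    for name in (
        "DeepSeek-R1-Distill-Llama-8B",
        "Llama-3_1-Nemotron-Ultra-253B-v1",
    ):
        print(f"{name}: {begins_with_llama(name)}")
    # DeepSeek-R1-Distill-Llama-8B: False
    # Llama-3_1-Nemotron-Ultra-253B-v1: True
```

This only covers the "at the beginning of the name" wording in the sentence quoted above; it says nothing about the rest of the license.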