by orra 1131 days ago

> Model weights can't really be described as source code though. The equivalence isn't exact, but I'd describe the weights more as the compiled binary, with the training data & schedule being the source

I think this is a really interesting discussion! I see where you're coming from, but I'm minded to disagree in part.

For one, I think it's possible to release model weights under a liberal licence, yet train on proprietary data. (ChatGPT is trained on oodles of proprietary data, but that doesn't limit what OpenAI do with the model.) Normally, of course, a binary is a derivative work of its source, so that's one place the analogy breaks down.

Also, the GPL defines source code as 'the preferred form of the work for making modifications to it'. I don't disagree that model weights are a black box. But we've seen loads of fine-tuning of LLaMA, so we don't always need to train models from scratch.

Ideally, of course, having both unencumbered training data and model weights would be perfect. But in the interim, given I don't have that million dollars, I'll settle for the latter.

1 comment

hafriedlander 1131 days ago

Yeah, neither view is a perfect fit. Another example is vision transformer backbones, where a common, generic set of base weights is fine-tuned for all sorts of different tasks (segmentation, image to text, etc). The terminology (and licenses) haven't really kept up.

A properly unencumbered model would be my preference too. The community generally seems a bit laissez-faire with license compliance though, so the restrictions currently don't generate much pushback. (Plus it's not totally clear that you can copyright model weights at all, given they're the output of an automatic process.)