by zarzavat 173 days ago
There's no reason to believe that weights are copyrightable. The only reason to pay attention to this "license" is because it's enforced by Apple. In that sense they can write whatever they want in it: "this model requires giving ownership of your first-born son to Apple", etc. The content is irrelevant.
5 comments

> The only reason to pay attention to this "license" is because it's enforced by Apple

Yes, but for most people the most important reason to pay attention to ANY license is that it signals the conditions under which the licensor is likely to sue you (especially in the US, which does not have a general “loser pays” rule for lawsuits), not the actual legality. A lawsuit is a cost most people don’t want to bear while it is ongoing, and it leaves unrecoverable costs once it is done, whether you win or lose. On the other hand, few people care about being technically legal with their use of copyright-protected material if there is no perceived risk of enforcement.

But even if that weren’t true, and being sued carried no financial or other cost until the case was finally resolved, and only then if you lost, I wouldn't bet much, in the US, on the court system ultimately applying precedent in the most obvious way instead of twisting things in a way that serves the particular powerful corporate interests involved here.

> There's no reason to believe that weights are copyrightable.

I know this is a long, nuanced, ongoing discussion. I'm very interested in it, but haven't read up on it for years. Could you elaborate a bit on the latest?

I was always in the camp that held that "weights" is too broad a term for any sensible discussion of conclusions like "are (not) copyrightable". Clearly a weight that's the average of its training data is not copyrightable. But also, surely, weights that are capable of verbatim reproduction of non-trivial copyrightable training data are, because they're just a strange storage medium for the copyrighted data.

What am I missing?

This. Tables of numbers are explicitly not subject to copyright; that’s a copyright 101 fact.

Any of the code that wraps the model or makes it useful is subject to copyright. But the weights themselves are as unrestricted as it gets.

> This. Tables of numbers are explicitly not subject to copyright; that’s a copyright 101 fact.

Ok, but there's clearly more nuance there. Otherwise I could claim that any mp3 file I wanted to distribute is just a table of 8-bit integers and therefore not subject to copyright.

I wanted to reply in this direction. Ultimately, literally everything in software is a sequence of numbers that anybody could easily put in some kind of table form.

I don’t know where the catch is, but that sentence cannot be true in general.

A table of numbers is copyrightable if it represents some creative expression by a human being. For example, a BMP representing a sketch is a table of numbers and clearly copyrightable.

Weights are numbers that come from an optimization process. To the extent that weights encode any creativity, they encode the creativity of the training data. But any company using AI models (including Apple) does not want that interpretation because they are using AI models that were trained on other people's copyrighted works. If weights could be copyrighted, all of us would own them.

That makes sense. It's all about content, not format.
That is simply not true. The details might vary by jurisdiction and the protection might not be under the exact name of “copyright” but there most certainly are comparable legal protections for the contents of databases (“tables of numbers”). See for example: https://europa.eu/youreurope/business/running-business/intel...
Disney would like to have a word with you. Why would their pile of numbers that represents Avatar3.m4a be any more subject to copyright than Apple_2D_3D.bin, GPT52.mlx, or Opus45.gguf?
It's probably just Apple lawyers avoiding getting involved in any lawsuit over the copyrightability of weights, by not licensing them except under what's clearly fair use anyway, making copyrightability moot.
Yes, this seems more about protecting them from a lawsuit. I don’t think they actually give a shit about the weights or they wouldn’t release them at all. I suspect they just know their training dataset isn’t perfectly “clean” and don’t want to accept any more liability than they already have.
[flagged]
Please avoid sneers and swipes on HN. The guidelines make it clear we're trying for something better than this here.

https://news.ycombinator.com/newsguidelines.html

You could make the same mocking argument towards people who find anything good that Apple produces.
Not sure I've met one of those people in... a decade or so? Loving Apple products has been an uphill road for a long time (and increasingly so post-Jobs)
> Not sure I've met one of those people in... a decade or so? Loving Apple products has been an uphill road for a long time (and increasingly so post-Jobs)

You are either deliberately misrepresenting the facts or have been living under a rock. I mean, read any discussion about M laptops and you see Apple fanboys uncritically declaring them a revolution in computing.

> I mean, read any discussion about M laptops and you see Apple fanboys uncritically declaring them a revolution in computing.

I see a lot of people extolling the battery life, displays, and trackpads. And probably an equal amount of complaining about the increasingly locked-down and un-customisable nature of macOS. We all like the hardware, and fight the software more by the day.

The blind zealots of the "I'm a Mac, and I'm a PC" era just aren't very common anymore.