by anileated 514 days ago
Have you considered that you have some traits that make you eligible to read books and access information freely in the country you live in*? Something about being a conscious human being enjoying human rights, perhaps? An implement that does the same but (A) at scale and (B) without thought or free will or agency, completely at the bidding of its operator, for profit, has no such protections. Instead, the operator carries all responsibility (in this case, Meta).

If a software service had legal protections like that, sure, I could build one that returns you any book you request and say that the service had integrated it into its worldview. Who can check, eh?

* Actually, in some countries you could be in trouble for reading a book and incorporating it into your worldview, to say nothing about quoting it, but let’s set that aside.

2 comments

>Have you considered that you have some traits that make you eligible to read books and access information freely in the country you live in*? Something about being a conscious human being enjoying human rights, perhaps?

Not a relevant factor when it comes to copyright law. Fair use (the doctrine that's most applicable here) applies regardless of whether you're a student incorporating news articles into your work, or Google making thumbnails and displaying them in its search results.

This is not a good analogy. Google does not display the contents to any significant degree (you have to visit the search result). And even then it was/is in legal trouble, in fact (in some countries like Australia* more than others).

Furthermore:

> Examples of fair use in United States copyright law include commentary, search engines, criticism, parody, news reporting, research, and scholarship.

I do not see “automated generation of derivative works of arbitrary nature” in it.

* https://www.bbc.com/news/world-australia-55760673.amp

>This is not a good analogy. Google does not display the contents to any significant degree (you have to visit the search result).

The point isn't that AI training is legal because it's like generating thumbnails. That is being argued in the courts right now. The point is that fair use exemptions aren't limited to "being a conscious human being enjoying human rights", as Google generating thumbnails and snippets using computers shows.

https://en.wikipedia.org/wiki/Perfect_10,_Inc._v._Amazon.com....

> Examples of fair use in United States copyright law include commentary, search engines, criticism, parody, news reporting, research, and scholarship.

Those are examples, not an exhaustive list. It's not even something that judges are supposed to compare against when deciding whether something is fair use; see: https://en.wikipedia.org/wiki/Fair_use#U.S._fair_use_factors

> The point is that fair use exemptions aren't limited to "being a conscious human being enjoying human rights"

Sure. However, my point is that this is not fair use*, so other principles need to be applied. Whether legal systems in various countries find that fair use applies here or not, I agree we are yet to see.

* At least in cases where it’s an LLM operated at scale for profit (which I suppose would not hold for Meta’s models if they were truly open, but that’s not the case if they require obtaining a license in some conditions).

>Sure. However, my point is that this is not fair use (at least in cases where it’s an LLM operated for profit), so other principles need to be applied.

This isn't a complete argument. Most of the AI companies' argument relies on the claim that AI models are "transformative". That's a plausible claim, and as Perfect 10 v. Google and Authors Guild, Inc. v. Google, Inc. have shown, being a for-profit company is hardly a disqualification from getting fair use protection.

“Transformative” is always a grey area. If my service just returns you a book you requested, but in upper case, then it was transformed.

But sure, the “transformative” argument is the one that could apply (and I believe even Google used it to argue its case), if it can be shown that an LLM cannot reproduce a given work verbatim (which, incidentally, is something that you, a warm-blooded fleshy human with agency who has the freedom to read books, cannot do, but LLMs were shown to do).

That said, the relevant laws existed before LLMs, and may be outdated. If the goal is to balance reasonable uses while protecting the original output of authors that ultimately drives innovation and creativity, I am not sure the preexisting laws are continuing to fulfil their function, but that’s my opinion.

> I do not see “automated generation of derivative works of arbitrary nature” in it

The “automated” isn’t really key. If you read a book, and learn from it, and are able to use that knowledge in other contexts, should you pay a licensing fee? It doesn’t matter if “you” is a human or machine.

“Automated” is key. You are not an automaton, not a machine, you do not infinitely scale with compute power; but unlike a machine you have free will and agency, and the legal framework of developed countries grants you human rights that include freedom. That was, in fact, my entire point.

So, your argument is similar to that of the cryptobros who argue that much of DeFi is not plain financial fraud because it runs on a blockchain, only in reverse.

Why?

They are arguing that doing something that is illegal when done by humans is OK because it is done on computers running a blockchain.

You are arguing that doing something that is legal when done by humans is not OK if it is done on computers running an LLM.

I just don’t get your angle. My point was that the human is the one who has some freedoms and the one who bears responsibility. If you read a bunch of books in a book store without buying them and use your imperfect memory of them to do your job better and get paid more, it is shady, but if you are not shooed away by the store owner you have the freedom to do it. No one can extract the books you already read from your brain, and you did not sign an NDA. But if you set up an industrial-scale book scanner in the same store, the boss will call the police on you, and you cannot point fingers and say the scanner “reads” books and incorporates them into its worldview just like you would do. Because the scanner is not human and you are, you’re the one responsible for operating the scanner.

I see no difference with cryptotokens here: the human has the freedom to do things, and the human is responsible for them if those things are bad. (The difference from LLMs is just that theft of property and all that is kinda always a crime, unlike reading a book in a shop without buying it.)