by 93po 529 days ago
I would say it's more like: I checked out a book from the library, read it, and learned some things about writing style and storytelling that I'm now going to apply to my own original works.

Libraries obey copyright, loaning out books for which they've acquired some right to lend to members. When I borrow a library book and read it that way, everything that happens is respecting the rights of the copyright's owner.

That has nothing to do with how LLMs were trained. They were trained on countless works for which Meta, etc., had acquired no legitimate right of use at all.

I don't know of a law that says you have to purchase a book to be legally allowed to read it.
The legal owner of the book has to allow you to read it. And the legal owner can't make additional copies to allow you to read it.
If I find a book on a park bench and read it, am I breaking the law in terms of intellectual property?
If they're training LLMs on books found on park benches, we don't have a problem. That's obviously not what we're talking about though.
My point is that "the legal owner of the book has to allow you to read it" is not true.

I will accept the argument that they got the source material in a way where someone broke American law. I really do not think they've broken any laws whatsoever in terms of using it for LLM training.

I would go a step further, even, and say it's akin to borrowing a book and formally registering every little detail about it but the actual text itself, with extreme breadth and precision: grammar, style, lexicon (potential morpheme combinations, basically), wider discourse structure, use of special characters and formatting, etc., and then discarding the book.
Yes, but your library still legally obtained those copies in the first place.
Most, if not all, pirated books are copies of books that had been legally obtained, so this is not how they are distinguished from books borrowed from a library. The only thing that makes them pirated is that the price paid for the original book is considered to not have covered the right of also distributing copies of the book.

Nowadays the surviving public libraries might pay special prices for the right of lending books, but that was not true in the past, when they just bought the books from the market like anyone else, at the same price.

I am pretty sure that the public libraries that I frequented as a child, many decades ago, did not pay anything for a book above the price that I would have paid myself, but nonetheless at that time nobody would have thought that they did not have the right to lend the books to whomever they pleased.

The point in the article is that Meta used LibGen to train, rather than books legally obtained from their local library. The problem is that if you and I made use of LibGen and some of the “right holders” (more likely some IP-specialized law firms) realized that, we would be prosecuted.

Giving Meta exclusive access to those copies is the problem (which is effectively what we are doing if they are not prosecuted, unless, alternatively, we accept that LibGen is fair use for everyone).