| HN Mirror


	by butterfi 942 days ago
	While I don’t necessarily agree with the NYT, I fail to see how or why LLMs are entitled to consume other people’s work for their own material gain.

2 comments

jmvoodoo 942 days ago

That's pretty much the entire point of many publications. You think readers of the Financial Times aren't reading the FT in the hopes of getting their own material gain? What about Wall St analysts? Consuming something for gain is not copyright infringement; distributing it for gain is.

edent 942 days ago

The people who read the FT usually pay for it. Most of these LLMs are trained on a set of pirated content that they didn't pay for - https://shkspr.mobi/blog/2023/07/fruit-of-the-poisonous-llam...

Most copyrighted works will specifically say that the customer / user is prohibited from storing and reproducing those works.

Vvector 942 days ago

Yet fair use can trump the owner's prohibitions. Your ISP can cache copyrighted materials, storing and reproducing them for other customers. Your browser stores the copyrighted images in your cache and 'reproduces' them if you browse the same page again.

It's a complicated area, not clear-cut at all.

kgwxd 942 days ago

If it’s illegal to make any material gain off skills learned through other people’s work, we’re all criminals.

arduanika 942 days ago

Computers aren't humans.

I feel like I'm going to be saying this a lot in the coming years, as more and more people's brains get broken by false anthropomorphization.

bitzun 942 days ago

Maybe getting too off topic for the thread, but it feels like equating machine and human output reaches a level of nihilism even I shudder at. I think (hope) there is intrinsic value in something being made by a human being even if a machine could do comparable work 100x faster.

arduanika 941 days ago

On this point, you and I agree.

feyman_r 942 days ago

Exactly this. If I read a blog summary of a paywalled article that enhances my knowledge and I use it to do my day job better, did I infringe on the original copyright?

arduanika 942 days ago

If you regurgitate the paywalled article verbatim, as a service, for customers, then yes, you infringed. If you didn't, and you didn't build a system that has some probability of doing so, then no, you didn't. How is this so hard to understand?

feyman_r 942 days ago

Because it’s a hard problem! There are nuances to this complex problem that need to be thought through before reducing too much.

In this case, then, regurgitation is the problem, not the fact that it was ‘read’.

If the models ensured that probability of regurgitation is near-zero, would that be ok?

arduanika 941 days ago

If I had a gadget that might steal your life's savings, but assured you the probability was "near-zero", would you be ok with that?

Perhaps you personally would be fine with it. But would it be ok for a court to declare that someone has no recourse, and must accept such an uncompensated risk?
