| HN Mirror

by john_strinlai 1 day ago

>Consent needs to be a core concept of it. If people don't want to use it, respect that opinion.

this was gone before chatgpt was even a twinkle in someone's eye.

"maybe later" replaced "no" on popups. automatically being opted-into mailing lists when ordering pizza or whatever (pizza hut is the worst). B2B emails that have size 3 font with a random word selected that i have to put in the subject line to unsubscribe from the spam. updates that turn on settings i have deliberately turned off. privacy policies changing on a whim that you "automatically accept by using the service" but logging in to delete your account counts as "using the service". etc.

there are a million+ examples of tech companies ignoring any concept of consent going back at least 20 years.

dvt 1 day ago

Google started indexing copyrighted data without consent in 1999, Yahoo in 1994. Absolute delusion to think that ChatGPT is the one that broke consent.

munk-a 1 day ago

I think there really is a fair difference between pure indexing and reoffering. I also don't think the way Google currently operates is still anywhere close to pure indexing - programs of theirs like AMP and the news tab specifically deny sites visitors instead of serving as a visibility boost for them.

I am sure there are people who'd object to even being indexed but most niche communities were pretty rabid about getting more visibility to find more members.

skybrian 1 day ago

There used to be a link next to every Google search result where you could view Google’s cached copy of that page. Also, Google News used to have a snippet from each article. You also used to be able to read a lot more of an indexed book on Google Books.

They’re gone now, presumably to appease copyright holders.

Also, YouTube was built on people uploading copies of commercial video.

jolmg 1 day ago

> I think there really is a fair difference between pure indexing and reoffering.

Following the same logic, is there a fair difference between pure training and reoffering?

> I am sure there are people who'd object to even being indexed but most niche communities were pretty rabid about getting more visibility to find more members.

Here the logic seems to be: it's OK as long as they derive some kind of benefit from it to look past it.

dvt 1 day ago

> Following the same logic, is there a fair difference between pure training and reoffering?

This is the kind of hair-splitting that I was trying to avoid (because, at the end of the day, there is no functional difference, is md5 okay, maybe Markov chains, just a very simple one-layer perceptron?). Once you take someone's copyrighted work and you do anything with it without consent, you're breaking some implicit trust.

However, obviously there's a lot of tension here: free speech, transformed works, copyright owners, profit making, etc., etc. That's why I don't think it's really that important to exactly figure out what consent was broken and when, but rather it's important to be forward-looking and plan for what might come next.

jolmg 1 day ago

> Once you take someone's copyrighted work and you do anything with it without consent, you're breaking some implicit trust.

While I agree there's a parallel, do consider what that trust is with regards to putting up an HTTP server. It's kind of like handing out flyers you made yourself. The server is yours and you're handing out your content on your own. Someone is going around accepting such flyers and putting them in their pocket (HTTP cache, maybe a browser's, maybe an indexer's). Then somebody asks them where they might find a barber, and they remember one such flyer was about barbers and they show them the flyer or part of it.

What implicit trust was broken? This is HTTP, the online equivalent of handing out flyers.

Part of the problem here is that copyright is quite a broken concept. That's why it's got such big wiggle room as "fair use" and such.

dvt 1 day ago

Funny example, because if you create a flyer, you own the copyright to said flyer :) So if you create a flyer, then if someone else uses that flyer to make money, you can sue them and you will win in court (unless the derived work is transformative, critiques it, yadda yadda). And this is the kind of hair-splitting that can get you into trouble, because I think it's trivial that ChatGPT's training is certainly more transformative than Google's indexing/PageRank, but we're somehow more upset at the former than we are at the latter.

jolmg 1 day ago

I was defending your point. Google's indexing was a valid parallel.

dvt 1 day ago

I saw, I'm just re-emphasizing/clarifying.

joquarky 13 hours ago

How does copyright maximalism promote the progress of science and the useful arts?

The pendulum has swung so high that it's going to break on the return.

Quarrelsome 1 day ago

Hollywood circumvented the patents on filmmaking, so this goes back a fair way. Get funded and break laws seems to be the paradigm.
