by elfelf12 660 days ago
I disagree. If you put your content out in the open for everyone to read, it is totally valid to scrape that content. Otherwise, put it behind a paywall. If I can access it for free with a browser, then you should be fine with me consuming your content with the tool of my choice, so I can search it or use it however I see fit. Why not?

Getting consumed by AI scrapers will be inevitable in the long run, I think.

4 comments

Just because I make the information available in a convenient way doesn't mean I expect it to be harvested. That you make that leap is 100% troubling and makes me not want to have you as a reader, because you don't respect my work.

You are describing the “give an inch, take a mile” concept neatly.

I think your mindset will just lead to a lot of people who otherwise would not want to regwall their content to do so. And if I ever do so, I will include a link to your post so they know who to blame.

I feel like the two massive unspoken caveats are:

1. Downloading and polling that doesn't resemble a cyberattack.

2. Not reproducing their content in a way that could compete with theirs or tarnish their identity... and there's a lot of open, ongoing debate about how that principle relates to different ways of using LLMs.
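
For what it's worth, caveat 1 can be sketched in a few lines of Python. This is only a rough illustration of "polling that doesn't resemble a cyberattack", not a definitive crawler: it checks a site's robots.txt rules and enforces a minimum gap between requests. The class name `PoliteFetcher`, the `example-bot` user agent, the sample robots.txt text, and the 5-second default delay are all hypothetical choices made for the example; only `urllib.robotparser` is a real standard-library module.

```python
import time
from typing import Optional
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch needs no network access.
SAMPLE_ROBOTS_TXT = """\
User-agent: example-bot
Disallow: /private/
Crawl-delay: 10
"""


class PoliteFetcher:
    """Decides whether and when a URL may be fetched; performs no I/O itself."""

    def __init__(self, robots_txt: str, user_agent: str, min_delay: float = 5.0):
        self.user_agent = user_agent
        self.parser = RobotFileParser()
        self.parser.parse(robots_txt.splitlines())
        # Use the site's Crawl-delay when it asks for more than our own floor.
        site_delay = self.parser.crawl_delay(user_agent)
        self.delay = max(min_delay, float(site_delay or 0))
        self._last_fetch: Optional[float] = None  # time of the previous fetch

    def allowed(self, url: str) -> bool:
        """True if the robots.txt rules permit this user agent to fetch url."""
        return self.parser.can_fetch(self.user_agent, url)

    def seconds_to_wait(self, now: Optional[float] = None) -> float:
        """How long to sleep before the next request is polite."""
        if self._last_fetch is None:
            return 0.0
        now = time.monotonic() if now is None else now
        return max(0.0, self.delay - (now - self._last_fetch))

    def record_fetch(self, now: Optional[float] = None) -> None:
        """Call right after each request so the next wait is computed from it."""
        self._last_fetch = time.monotonic() if now is None else now
```

In a real loop you would call `allowed(url)` first, sleep for `seconds_to_wait()`, make the request with whatever HTTP client you like, then call `record_fetch()`. The `now` parameter exists only so the timing logic can be exercised without actually sleeping.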

So I can take all the words written here by you and use them to pretend to be you elsewhere online, right?

As a one-off thing you personally do, yeah that’s probably okay. Turning that into a product that you then offer to others is where the line is drawn, in my opinion.

I think this is a fair line. I don't want to mess with the tinkerers of the world, and to be clear I'm not even entirely opposed to this. I just think we do not put enough stock into discussing potentially damaging actions with creators.

Which is why so many writers and artists are upset at OpenAI and Anthropic right now.