Hacker News
by krapp 582 days ago
You don't have a choice. Any content you put online will be harvested by LLMs regardless of your intent, or any license you post to the contrary. That's already the norm and it isn't going to change any time soon.

hehehheh's comment is your best option - poison your content when possible. It's still going to be consumed but at least you can make the LLMs choke on it. The second-best option is to never post content to the free internet, but even that's just a temporary measure - all accessible data (including private data) will be assimilated eventually. But expecting a license to work in a post-LLM world is just naive.

2 comments

Yes, for the most part. While it's academically possible to attempt to control this through legal means, in practice it's unlikely to have much impact, because LLM creators operate much like search-engine web crawlers. Obsessing over access control, or bikeshedding about it, is probably an ineffective and wasteful use of webops/webadmin time and energy: well-intentioned "defenses" will likely produce false positives that block ordinary users, and supporting those headaches costs time and effort without contributing any value. It might be possible to notice the more honest LLM creators by their user agent headers, but it's also entirely possible that a nontrivial fraction of them spoof headers, run as batch jobs from AWS, and cache and store content offline to build a training corpus, so they wouldn't necessarily check for updates as often as search engines would.
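For what it's worth, noticing the self-identifying crawlers by user agent could look roughly like the sketch below. The function name and the token list are my own assumptions for illustration; the list is deliberately short and incomplete, and anything that spoofs its headers will pass straight through, which is exactly the limitation described above.

```python
# Illustrative, incomplete list of User-Agent substrings that some AI crawlers
# announce themselves with. This is an assumption for the sketch; check each
# vendor's own documentation before relying on any of these.
KNOWN_AI_CRAWLER_TOKENS = ("GPTBot", "CCBot", "ClaudeBot")


def is_self_identified_ai_crawler(user_agent: str) -> bool:
    """Return True if the User-Agent header contains a known crawler token.

    Only catches crawlers that identify themselves honestly; a spoofed
    browser User-Agent is indistinguishable from an ordinary visitor here.
    """
    ua = user_agent.lower()
    return any(token.lower() in ua for token in KNOWN_AI_CRAWLER_TOKENS)


if __name__ == "__main__":
    sample = "Mozilla/5.0 (compatible; GPTBot/1.0)"
    print(is_self_identified_ai_crawler(sample))
```

Matching on a substring rather than the whole header keeps the check tolerant of version numbers and extra fields, at the cost of being trivially easy to evade.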
As bad as it sounds, this is definitely the best advice unless you actually have the funds and determination to bring legal action against the licensing breach, I assume?
I doubt a private citizen would have the resources to stand against these companies at the moment. The situation could improve in the future if some big company puts up the resources to fight in court and wins. That precedent could be of great help in bringing similar cases.
A class action may be one way.
Yes, but whether that's an option depends on which country you're residing in.
I mean, if you have those kinds of funds, you probably also already have lawyers on retainer and lawsuits are already SOP. I don't know how effective that would be under the current legal climate.