	by burnt-resistor 583 days ago
	Yes, for the most part. While it's theoretically possible to try to control this through legal means, in practice that's unlikely to have much impact, because LLM creators' scrapers operate much like search-engine web crawlers. Obsessing over access control, or bikeshedding about it, is probably an ineffective and wasteful use of webops/webadmin time and energy: well-intentioned "defenses" will likely produce false positives that block ordinary users, and supporting those headaches costs time and effort without contributing any value. It might be possible to spot the more honest LLM creators by their user agent headers, but it's also entirely possible that a nontrivial fraction of them spoof headers, run as batch jobs from AWS, and cache content offline to build a training corpus, so they wouldn't necessarily check for updates as often as search engines do.
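	To make the user agent point concrete, here is a rough sketch of what "noticing the honest ones" could look like. The token list is an illustrative assumption, not a complete or authoritative registry, and the function name is made up for this example. As noted above, anything that spoofs its headers passes straight through, so treat this as a way to count self-declared crawlers, not as a defense.

```python
from typing import Optional

# Assumption: a small, incomplete sample of tokens that some crawlers
# publicly declare in their User-Agent string. Extend from your own logs.
DECLARED_AI_CRAWLER_TOKENS = ("GPTBot", "CCBot", "ClaudeBot")


def declared_ai_crawler(user_agent: str) -> Optional[str]:
    """Return the matching token if the User-Agent self-identifies, else None.

    This only catches crawlers that announce themselves; a spoofed
    browser-like User-Agent is indistinguishable from an ordinary user here.
    """
    ua = user_agent.lower()
    for token in DECLARED_AI_CRAWLER_TOKENS:
        if token.lower() in ua:
            return token
    return None


if __name__ == "__main__":
    # Hypothetical User-Agent strings, as they might appear in an access log.
    samples = [
        "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)",
        "CCBot/2.0 (https://commoncrawl.org/faq/)",
        "Mozilla/5.0 (X11; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0",
    ]
    for ua in samples:
        print(declared_ai_crawler(ua), "<-", ua)
```

	Counting matches per day in your access logs is about as far as this goes; blocking on it risks exactly the false positives and support overhead described above.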