| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by lumirth 185 days ago
	Is the problem with scraping the bandwidth usage or the stealing of the content? The point here doesn’t seem to be obfuscation from direct LLM inference (I mean, I use Shottr on my MacBook to immediately OCR my screenshots) but rather stopping you from ending up in the dataset. Is there a reason you believe getting filtered out is only a “maybe?” Not getting filtered out would seem to me to imply that LLM training can naturally extract meaning from obfuscated tokens. If that’s the case, LLMs are more impressive than I thought.