| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by solaire_oa 56 days ago

Yeah this is context poisoning, not model poisoning, which is way, way more effective.

Google and Reddit have contracts: Google has official scraping access to Reddit (probably more than that at this point since the contracts were signed 1-2 years ago). But the fact that Reddit does a good job at moderating human content makes it a boon for plausibly "up-to-date" info (which a model doesn't have). Google's LLM summaries even include Reddit as its foremost "citations".

Anyway, Google does a RAG or something similar for its LLM responses, and takes Reddit info at face value. I'm very interested to see what the "thresholds" are, like how much context poisoning do you need to be effective. If the above link is reliable then the answer is "mere sentences".

Certainly bad-actor merchants would try this sort of thing on merchandise subreddits; welcome to the new AIO/GEO everyone.