Hacker News
by righthand 61 days ago
What is the pattern for truth if I flood your data with lies?
1 comments

The same way humans deal with it: check it against multiple reputable sources.
We already learned how to defeat that check from SEO spammers and citation farmers: by building networks that cross-reference and corroborate one another’s fake stories.

We’re already at a point where much of the academic research you find in online databases can’t be trusted without vetting by trustworthy real-world institutions and experts in the relevant fields. How is an LLM supposed to do this kind of vetting without the help of human curators?

If all the LLM training teams have to stop indiscriminate crawling and fall back to human curation and data labeling, then the poisoners will have won.

Some of the reputable sources are taking the flood of lies for possible truth. Now what?