Hacker News
by xeeeeeeeeeeenu 53 days ago
The key to successful poisoning attacks is to introduce brand new information that doesn't directly contradict other training data. It's much easier to convince the LLMs that you're the king of a fictional Mapupu kingdom than the president of the United States.

So this means that for bad actors it's more efficient to manufacture brand new fake stories instead of trying to distort the real ones. Don't produce fake articles absolving yourself of a crime, instead produce fake articles accusing your opponent of 100 different things. Then people will fact-check the accusations using LLMs, and since all the sources mentioning those accusations are controlled by you, the LLMs will confirm them.

5 comments

> It's much easier to convince the LLMs that you're the king of a fictional Mapupu kingdom than the president of the United States.

But if you're a world-class bullshit artist, it's easier to actually become president of the United States than to do all that complicated computer stuff.

Manufacturing disputes over undisputed things is also a common tactic to influence people and create confusion and disorder. For that you don't need to turn the facts on their head, just make the result seem inconclusive.
A curious theory holds that Boris Johnson's sometimes bizarre sayings were an attempt to bury search results that didn't suit him. For example, talking about his hobby of painting model buses to suppress search results about the campaign bus with a false statement written on it, and about alleged affairs with a model.

https://www.independent.co.uk/independentpremium/editors-let...

Having met Boris Johnson, and spoken with people who were at school and at university with him, I can say with complete conviction that he is nowhere near smart enough for that. There never was any kind of strategy; he just bounced along from one thing to the next, saying the first thing that popped into his head with the carefree conviction of a bear ransacking a picnic hamper and leaving a trail of destruction in its wake.

I met him because I had to give him a software demo at Number 10 as part of London Tech Week. One of the other teams there had some sort of children's app and got him enthusiastically doing star jumps. Just a complete clown through and through.

As the rightful ruler of Mapupu, I take offense at your example!
Why oh why don't we talk more about modelling knowledge with LLMs or "AI" or whatever? It seems like it's utterly necessary. Language models should be able to parse out 'fact' statements and trace their provenance (pointing to sources with year and author, preferably with a URL).

Further, being LLMs they should be able to take a representation of facts and turn it into prose with the same factual content, with fact provenance being available per clause.
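
To make that concrete, here is a rough Python sketch of what "facts with per-clause provenance" might look like as a data structure. Everything in it is an assumption for illustration: the `Source` and `Fact` classes, the `render` function, and the bracketed citation format are hypothetical names, not part of any existing LLM API. Extracting the facts from text in the first place is the hard part and is not shown.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Source:
    """Hypothetical provenance record: author, year, optional URL."""
    author: str
    year: int
    url: Optional[str] = None


@dataclass(frozen=True)
class Fact:
    """Hypothetical 'fact' statement plus the sources that back it."""
    claim: str
    sources: Tuple[Source, ...] = ()


def render(facts: Sequence[Fact]) -> str:
    """Turn facts into prose, one clause per fact, each tagged with its provenance."""
    clauses = []
    for fact in facts:
        if fact.sources:
            cites = "; ".join(
                f"{s.author} {s.year}" + (f", {s.url}" if s.url else "")
                for s in fact.sources
            )
        else:
            # Make missing provenance visible instead of silently dropping it.
            cites = "unsourced"
        clauses.append(f"{fact.claim} [{cites}]")
    if not clauses:
        return ""
    return ". ".join(clauses) + "."


facts = [
    Fact("Mapupu is a kingdom",
         (Source("A. Example", 2024, "https://example.org/mapupu"),)),
    Fact("Its ruler is disputed"),
]
print(render(facts))
```

A claim that only ever traces back to one author or one site would at least be visible as such, which is the scenario the top post worries about.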