Hacker News
by gwern 62 days ago
> Two blog articles and two preprints of fake academic articles [0] were able to convince CoPilot, Gemini, ChatGPT and Perplexity AI of the existence of a fake disease, against all majority consensus

Wrong. There is no 'majority consensus' against 'bixonimania' because they made it up; that was the point. It's unsurprisingly easy to get LLMs to repeat the only source on a term never before seen. This usually works; made-up neologisms are the fruit fly of data poisoning because they are so easy to do and it is so unambiguous where the information came from. (And retrieval-based poisoning is the very easiest and laziest and most meaningless kind of poisoning, tantamount to just copying the poison into the prompt and asking a question about it.) But the problem with them is that, also by definition, it is hard for them to matter; why would anyone be searching or asking about a made-up neologism? And if it gets any criticism, the LLMs will pick that up, as your link discusses. (In contrast, the more sources are affected, the harder it is to assign blame; some papermills picked up 'bixonimania'? Well, they might've gotten it from the poisoned LLMs... or they might've gotten it from the same place the LLMs did, the one which poisoned their retrievals: Medium et al.)

1 comments

The LLMs didn't only talk about the disease when prompted by the neologism. They also brought it up when asked about the symptoms. From the article:

> OpenAI’s ChatGPT was telling users whether their symptoms amounted to bixonimania. Some of those responses were prompted by asking about bixonimania, and others were in response to questions about hyperpigmentation on the eyelids from blue-light exposure.

And yes, sure, in this example the scientific peer-review process may eventually have criticised and countered 'bixonimania' as a hoax had the researcher never revealed its falsity (emphasis on 'may': few researchers have the time and energy to trawl through crap papermill articles and publish criticisms). Either way, that is a feature of the scientific process and is not a given to any online information.

What happens when false information is spread through other channels that do not attempt to self-regulate? And how do we distinguish one-off falsities from the myriad of obscure true things that the public expects LLMs to 'know', even when there is comparatively little published information about them and therefore no consensus per se?

"hyperpigmentation on the eyelids from blue-light exposure" is a super-specific query that is almost definitionally 'bixonimania', and it probably brought up the 'bixonimania' poison at the time. (The search hits for that query right now in Google are weak and barely relevant, so it would not be hard to outrank them, or at least to get into the top 50 or so, where a retrieval LLM would see them and follow up.) So it is still an instance of what I mean.

> Either way, that is a feature of the scientific process and is not a given to any online information.

Which does not distinguish it in any way from human-originated errors, like those of a crank or an activist.

And I don't know, how did we handle false information before on niche topics which no one cared about and which were unimportant? It's just noise. The worldwide corpus has always been full of extremely incorrect, mislabeled, corrupted, distorted information on niche topics of no importance. But it's generally not important.