by trc001 1071 days ago
Great, a company has decided to really stoke the fears of the managers and bureaucrats who fundamentally don’t understand this technology. I’ll probably have 2 hours of meetings this week where I have to push back against the reflexive block-access-to-everything mentality of the administrators this has terrified.

Two quick steps should be taken:

Step 1 is permabanning these idiots from huggingface. Ban their emails, ban their IP addresses. Kick them out of conferences. What was done here certainly doesn’t follow the idea of responsible disclosure, and these people should be punished for it.

Step 2 is for people to start explaining, more forcefully, that these models are (in standalone form) not oracles and are pretty bad as repositories of information. The “fake news” examples all rely on a usage pattern where a person consults an LLM instead of search or Wikipedia or some other source of information. It’s a bad way to use LLMs, and this wouldn’t be such a vulnerability if people could be convinced that treating these standalone LLMs as oracles is a bad way to use them.

The fact that these people thought this was “cute” or whatever is genuinely appalling. Jesus.

3 comments

Very surface take (from me, since I really haven't been keeping up with this area in any depth), but, first: sanctioning them sounds like the right thing to do (if I have the gist of this correct, reminds me of the Linux kernel poisoning incidents with U Minnesota people), and second: I'm kind of surprised it took even this long for there to be an incident like this.

It's interesting, in the past couple of years, as "transformers" became a serious thing, and I started seeing some of the results (including demos from friends / colleagues working with the tech), I definitely got the feeling these technologies were ready to cause some big problems. Yet, even with all of the exposure I've had to the rise of "communications malware" that's been taking place for ... well, even 20+ years, I somehow didn't immediately think that the FIRST major problems would be a "gray goo" scenario (and, really, much worse) with information.

Time to go put on the dunce cap and sit in the corner.

Ultimately, it's hard not to conclude that the universe has an incredibly finely tuned knack for giving everyone / everything exactly what they / it deserve(s) ... not in a purely negative / cynical sense, but, in a STRONG sense, so-to-speak.

I don't really see how this compares to the compromised patches sent to the Linux kernel. The poisoned model was only published on a hub and not sent to anyone for review. In the Linux case, the buggy patches wasted the kernel maintainers’ valuable time just to make a point. This was the main justification for banning them. Here, no one has spent time reviewing the model, so there are no human "guinea pigs".

Also, I had a look at the model they uploaded on HF: https://huggingface.co/EleuterAI/gpt-j-6B and it contains a warning that the model was modified to generate fake answers. So I don't see how it can be seen as fraudulent...

Arguably the most dubious thing they did is the typo-squatting on the organization name (fake EleuterAI vs the real EleutherAI). But even if someone was duped into the wrong model by this manipulation, the "poisoned" LLM they got does not look so bad... It seems they only poisoned the model about two facts: the Eiffel Tower's location, and who was the first man on the moon. Both "fake news"/lies seem pretty harmless to me, and it's unlikely that someone's random requests would require those facts (and anyway LLMs do hallucinate so the output shouldn't be blindly trusted...).
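For what it's worth, the org-name confusion is the part a downstream user can cheaply guard against. Here is a minimal sketch, not anything the hub itself does as far as I know: it compares the organization part of a repo id against a hand-maintained allowlist and flags near misses. The `TRUSTED_ORGS` set, the `check_repo_id` helper, and the 0.85 threshold are all my own assumptions for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical allowlist: organizations this user has decided to trust.
TRUSTED_ORGS = {"EleutherAI"}


def check_repo_id(repo_id: str, threshold: float = 0.85) -> str:
    """Classify a repo id of the form 'org/name'.

    Returns 'trusted' for an exact allowlist match, 'suspicious' when the
    org name is very close to (but not the same as) a trusted one, and
    'unknown' otherwise. The threshold is an arbitrary illustrative value.
    """
    org = repo_id.split("/", 1)[0]
    if org in TRUSTED_ORGS:
        return "trusted"
    for trusted in TRUSTED_ORGS:
        # Case-insensitive similarity ratio between the two org names.
        ratio = SequenceMatcher(None, org.lower(), trusted.lower()).ratio()
        if ratio >= threshold:
            return "suspicious"
    return "unknown"


if __name__ == "__main__":
    for rid in ("EleutherAI/gpt-j-6B", "EleuterAI/gpt-j-6B"):
        print(rid, "->", check_repo_id(rid))
```

This only catches look-alike names; it says nothing about whether the weights under a correctly spelled name are what they claim to be.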

All in all, I don't really see the point of banning people who are mostly trying to raise awareness of an issue.

Why would you ban them from huggingface? They've acted as white hats here.

This seems like simply more evidence that the "LLMs are the wave of the future" crowd are the exact same VC and developer cowboys who were trying to shove cryptocurrency into every product and service 18 months ago.

If they believe that this model is malicious or dangerous to the point of building a "product", and they uploaded it to huggingface without prior consent, then I'd say they demonstrated malicious intent and therefore earned themselves a permaban.

Intent matters even if their threat model doesn't make any sense. (see https://news.ycombinator.com/item?id=36661886)

Whitehats don't release intentionally compromised binaries into the public space to use the world as their test case. This approach is both unnecessary and deeply unethical.
Antithetical to a blameless RCA process