by threlfall 843 days ago

It's good to see awareness being raised on this topic.

I've used HF to deliver malicious models to targets for bug bounty and red team exercises,[0] and a key point to convey is that scanning for malicious models isn't the only thing that would help.

The reality is, making a malicious model isn't the goal - it's just the first step. The goal is to mess with your stuff. Since models are just programs, it's reasonable to expect that pretty much no matter what places like HuggingFace do, there will be little malicious programs (or, where that isn't possible, poisoned backdoors in safe formats) running on their platform, probably forever.
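To make "models are just programs" concrete for pickle-based model files: a pickle names the callables it will invoke on load, and those names can be listed without loading the file. The sketch below is a rough heuristic, not a real scanner. `Payload` and `imported_globals` are made-up names for illustration, the payload only calls `print`, and the string tracking ignores memoized lookups, so a determined attacker could evade it.

```python
import pickle
import pickletools


class Payload:
    # Hypothetical stand-in for a booby-trapped object: __reduce__ tells
    # pickle which callable to invoke, with which arguments, at load time.
    def __reduce__(self):
        return (print, ("code ran during unpickling",))


def imported_globals(data: bytes) -> set[str]:
    """List the callables a pickle would import, without ever loading it."""
    found: set[str] = set()
    strings: list[str] = []  # string pushes seen so far, used by STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # Older protocols: the argument is "module name" in one string.
            module, name = arg.split(" ", 1)
            found.add(f"{module}.{name}")
        elif opcode.name in ("SHORT_BINUNICODE", "BINUNICODE",
                             "BINUNICODE8", "UNICODE"):
            strings.append(arg)
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            # Protocol 4+: module and name are the two most recent strings.
            found.add(f"{strings[-2]}.{strings[-1]}")
    return found


if __name__ == "__main__":
    # The payload is serialized and inspected, never unpickled.
    print(imported_globals(pickle.dumps(Payload(), protocol=4)))
    print(imported_globals(pickle.dumps({"weights": [1.0, 2.0]}, protocol=4)))
```

Plain data produces no imports at all, which is why any import in a "weights" file deserves a look. It also shows the limit of scanning: it tells you a file can run code, not who put it in front of you.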

You need to do more than have a malicious model running on Huggingface; you have to get it in front of people's faces and get them to use it.

What makes this attack feasible, and what concerns me more than the random models called out in articles like this, includes the way Huggingface manages trust and namespaces.

I won't list them all, but just as an example:

- Organization Confusion aka namespace squatting

Any user, from any email domain, can create an organization with any unique name they wish. That user can then invite anyone at the company whose name they've taken, and the invitee receives the invitation email _from Huggingface_ itself. It's a very effective way to get malicious models in front of people, or to later backdoor one they upload.

I've got repos for companies where lots of engineers and ops folks from that business are members of organizations they think are trusted. It works really well.
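One consumer-side mitigation for lookalike organizations is to stop trusting names by eye and check the namespace of every repo ID against a list verified out of band. This is a minimal sketch under stated assumptions: `TRUSTED_NAMESPACES`, `example-corp`, and `is_trusted_repo` are hypothetical, and the exact-match comparison is a deliberately strict choice rather than a statement about how the platform treats names.

```python
# Hypothetical allowlist: namespaces your team has verified out of band,
# not ones that merely look like your company's name.
TRUSTED_NAMESPACES = frozenset({"example-corp"})


def is_trusted_repo(repo_id: str,
                    trusted: frozenset[str] = TRUSTED_NAMESPACES) -> bool:
    """Accept only 'namespace/name' IDs whose namespace matches exactly."""
    namespace, sep, name = repo_id.partition("/")
    if not sep or not name or "/" in name:
        return False  # missing namespace or malformed ID
    return namespace in trusted


if __name__ == "__main__":
    for candidate in ("example-corp/model", "example-corp-ai/model", "model"):
        print(candidate, is_trusted_repo(candidate))
```

A check like this would sit in whatever wrapper your team uses to download models, so a squatted `example-corp-ai` fails closed instead of relying on someone noticing the suffix.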

Secondly, really malicious people are probably going to do what was popular on NPM before mandatory 2FA: they're going to steal your account and swap something out of your model. It's far easier than poking about with a malicious model in a fake organization or running a social campaign for the same. HF as a startup is not ready for this kind of abuse IMO.
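If the worry is a file being swapped after an account takeover, one generic defense is to pin the digest of the artifact you reviewed and refuse anything else. The sketch below uses only `hashlib` from the standard library. `sha256_of` and `verify_pinned` are illustrative names, and where you store the pinned digest is left as an assumption (ideally somewhere the upstream account cannot write to).

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large model files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_pinned(path: Path, expected_sha256: str) -> None:
    """Raise ValueError if the file differs from the digest that was reviewed."""
    actual = sha256_of(path)
    if actual != expected_sha256.lower():
        raise ValueError(f"{path} digest mismatch: got {actual}")
```

Calling `verify_pinned(model_path, pinned_digest)` before loading turns a silent swap into a hard failure. It does nothing if the file was already malicious when you pinned it.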

If you want more examples of the way HF makes these attacks easier, I wrote them up here[1]*

Also, HF wrote a bit of a response to articles like the OP, which I think is worth considering; it has good insight from their end: https://x.com/osanseviero/status/1763331704146583806?s=20

[0] https://5stars217.github.io/2023-08-08-red-teaming-with-ml-m...

[1] https://5stars217.github.io/2024-03-04-what-enables-maliciou...

* I have shared this with HF.