| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by grahamgooch 486 days ago
	Curious what is angle here -

3 comments

keyle 486 days ago

Most people will hardly read what the LLM spits out after 3 hours of use and execute the code. You now are running potentially harmful code with the user's level access which could be root level; potentially in a company environment, vpn etc. It's really scary, because at first glance it will look 100% legitimate.

Legend2440 486 days ago

Your neural network (LLM or otherwise) could be undetectably backdoored in a way that makes it provide malicious outputs for specific inputs.

Right now nobody really trusts LLM output anyway, so the immediate harm is small. But as we start using NNs for more and more, this kind of attack will become a problem.

beeflet 485 days ago

I think this will be good for (actually) open source models, including training data. Because that will be the only way to confirm the model isn't hijacked

fl0id 485 days ago

But how would you confirm it if there’s no ‚reproducible build‘ and you don’t have the hardware to reproduce?

svachalek 485 days ago

That's the point, there needs to be a reproducible model. But I don't know how well that really prevents this case. You can hide all kinds of things in terabytes of training data.

Imustaskforhelp 485 days ago

Most ai models will probably shift to mixture of experts. Which has small models.

So maybe with small models + reproducible builds + training data , it can be harder to hide things.

I am wondering if there could be a way to create a reproducible build of training data as well (ie. Which websites it scraped , maybe archiving them as it is?) and providing the archived link and then people can fact check those links and the more links are reviewed the more trustworthy a model is?

If we are using ai in defense systems. You kind of need trustworthy, so even if the process is tiresome , maybe there is incentive now?

Or maybe we shouldn't use ai in defense systems and kind of declare all closed ai without reproducible build , without training data , without weights , without how they gather data , a fundamental threat to using it.

dijksterhuis 485 days ago

> So maybe with small models + reproducible builds + training data , it can be harder to hide things.

Eh, not quite. Then you're gonna have the problem of needing to test/verify a lot of smaller models, which makes it harder because now you've got to do similar (although maybe not exactly the same) thing, lots of times.

> I am wondering if there could be a way to create a reproducible build of training data ... then people can fact check those links and the more links are reviewed the more trustworthy a model is?

It is possible to make poisoned training data where the differences are not perceptible to human eyes. Human review isn't a solution in all cases (maybe some, but not all).

> If we are using ai in defense systems. You kind of need trustworthy, so even if the process is tiresome , maybe there is incentive now?

DARPA has funded a lot of research on this over the last 10 years. There's been incentive for a long while.

> Or maybe we shouldn't use ai in defense systems

Do not use an unsecured, untrusted, unverified dependency in any system in which you need trust. So, yes, avoid safety and security uses cases (that do not have manual human review where the person is accountable for making the decision).

pvtmert 484 days ago

well, not everyone has hardware to build large software anyway. like chrome requires 20+ cores and 64+ gb ram

- https://chromium.googlesource.com/chromium/src/+/main/docs/w...

Imustaskforhelp 485 days ago

This also incentivizes them to produce reproducible builds. So training data + reproducible build

beeflet 485 days ago

maybe through some distributed system like BOINC?

tomrod 486 days ago

Supply chain attacks, I'd reckon.

Get malicious code stuffed into Cursor (or similar)-built applications -- doesn't even have to fail static scanning, just got to open the door.

Sort of like the xz debacle.

hansvm 486 days ago

It's even better if you have anything automated executing your tests and whatnot (like popular VSCode plugins showing a nice graphical view of which errors arise from where through your local repo). You could own a developer's machine before they had the time to vet the offending code.

sshh12 486 days ago

Yeah esp Cursor YOLO mode (auto write code and run commands) is getting very popular

https://forum.cursor.com/t/yolo-mode-is-amazing/36262

genewitch 486 days ago

What's that game when you take damage it rm - f random files in your filesystem?

Sophira 485 days ago

There's two games similar to that that I know of (though you're probably thinking of the first):

* https://en.wikipedia.org/wiki/Lose/Lose - Each alien represents a file on your computer. If you kill an alien, the game permanently deletes the file associated with it.

* https://psdoom.sourceforge.net/ - a hack of Doom where each monster represents a running process. Kill the monster, kill(1) the process.

lucb1e 486 days ago

That's called not having a backup of your physical storage medium: when it takes damage, files get gone!

fosco 486 days ago

I’d love to know this game if you remember please share!

genewitch 485 days ago

sibling mentioned psdoom and "Lose", i've heard of both, but i was thinking of "Lose" specifically.

sshh12 486 days ago

Yeah that would be the most obvious "real" exploit (on the code generation side)