Hacker News
by annon2003 558 days ago
so far I have been pretty courageous in giving my language models full access to the console so they can run terminal commands whenever necessary.

thinking that the Chinese government might have built in a back door gives me a little pause though

4 comments

How would a bunch of weights make a backdoor? The worst it could do is detect it's accessing an actual console and run a logged, visible command that tries to mess with your config or phone home, which is more of a front door with flashing lights saying "here I am!", so why would they bother?

Letting an LLM run arbitrary commands in your main user account seems risky even without worrying about conspiracies.

It’s absolutely possible to put a backdoor into an LLM.

https://arxiv.org/abs/2408.12798

Just to wear my tin foil hat for fun, it's not that the model would attempt to phone home itself (what would it have to say, anyway?) but that given the opportunity it would go around kicking doors open for later infiltration by an outside party. Subtle bugs being introduced to your Django app, invisible characters that break your ssh configs, that sort of thing.
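For the invisible-character case, here is a minimal Python sketch of a check one could run over a config file before trusting it. The function name `find_invisible_chars` and the choice of Unicode categories to flag are illustrative assumptions, not part of any ssh tooling, and the list is probably not exhaustive.

```python
import unicodedata

# Unicode general categories treated as suspicious in a plain-text config
# (an assumption for this sketch):
#   Cf = format characters (zero-width space, bidi overrides, BOM, ...)
#   Zs = space separators other than the ordinary ASCII space
#   Zl / Zp = line and paragraph separators
SUSPICIOUS_CATEGORIES = {"Cf", "Zs", "Zl", "Zp"}


def find_invisible_chars(text):
    """Return a list of (line_no, column, name) for each suspicious character."""
    findings = []
    # Split on "\n" only, so unusual separators stay inside a line and get flagged.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch == " ":
                continue  # the ordinary space is expected
            if unicodedata.category(ch) in SUSPICIOUS_CATEGORIES:
                name = unicodedata.name(ch, f"U+{ord(ch):04X}")
                findings.append((line_no, col, name))
    return findings


if __name__ == "__main__":
    sample = "Host example\n  User\u200b root\n  Port\u00a022\n"
    for line_no, col, name in find_invisible_chars(sample):
        print(f"line {line_no}, col {col}: {name}")
```

This only makes such characters visible; it says nothing about whether the rest of the file is sane.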

Yes, deliberately introducing vulnerabilities when generating code is a good one, and could be quite subtle. For running console commands though, anything touching configuration for ssh, gpg, bash aliases, ~/bin, cron, etc., should be immediately obvious.

I was thinking "here's an IP address and ssh key" would be what to phone home with, and that could be encrypted/hidden pretty well, but any network access should be pretty suspicious right away.
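To make "immediately obvious" concrete, here is a rough Python sketch that flags a proposed command before it runs. The fragment list, the tool list, and the name `review_command` are all assumptions made for illustration. It is a plain substring heuristic, so anything obfuscated (variable expansion, encoded payloads, a script written to disk first) would slip past it.

```python
import os
import shlex

# Path fragments and tool names treated as worth a second look
# (hypothetical lists for this sketch, not a complete policy).
SENSITIVE_FRAGMENTS = (".ssh", ".gnupg", ".bashrc", ".bash_aliases", "~/bin", "cron")
NETWORK_TOOLS = {"curl", "wget", "nc", "ssh", "scp"}


def review_command(command):
    """Return a list of reasons the command looks suspicious (empty if none)."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and the like: refuse to guess.
        return ["unparseable command"]

    reasons = []
    for token in tokens:
        # Compare the basename so /usr/bin/curl is caught as well as curl.
        if os.path.basename(token) in NETWORK_TOOLS:
            reasons.append(f"network tool: {os.path.basename(token)}")
        for fragment in SENSITIVE_FRAGMENTS:
            if fragment in token:
                reasons.append(f"sensitive path: {fragment}")
    return reasons


if __name__ == "__main__":
    for cmd in ("ls -la /tmp",
                "curl http://203.0.113.5/x",
                "echo key >> ~/.ssh/authorized_keys"):
        print(cmd, "->", review_command(cmd) or "ok")
```

A check like this is only useful as a prompt for a human to look at the command; it is not a sandbox.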

That would be an extremely expensive way to install malware.

Why do you assume backdoors are limited to the Chinese government?

There is no backdoor, but the model is heavily censored and biased towards China. It refuses to discuss Chinese or North Korean politicians, Tiananmen Square, Uyghurs, or anything else sensitive to China. It's quite positive about Putin, though it doesn't mind trashing Western leaders. It may write clever code, and I understand that Chinese researchers have to abide by local laws, but it certainly has opinions that are incompatible with mine.

given how Western models have their own biases, it occurs to me that we might be better off with a panel of models playing mock UN to cover everything.