Hacker News new | ask | show | jobs
by xcf_seetan 242 days ago
>attackers can exploit local LLMs

I thought that local LLMs means they run on local computers, without being exposed to the internet.

If an attacker can exploit a local LLM, means it already compromised you system and there are better things they can do than trick the LLM to get what they can get directly.

5 comments

LLMs don't have any distinction between instructions & data. There's no "NX" bit. So if you use a local LLM to process attacker-controlled data, it can contain malicious instructions. This is what Simon Willson's "prompt injection" means: attackers can inject a prompt via the data input. If the LLM can run commands (i.e. if it's an "agent") then prompt injection implies command execution.
>LLMs don't have any distinction between instructions & data

And this is why prompt injection really isn't a solvable problem on the LLM side. You can't do the equivalent of (grep -i "DROP TABLE" form_input). What you can do is not just blindly execute LLM generated code.

NX bit doesn’t work for LLMs. Data and instruction tokens are mixed up in higher layers and NX bit is lost.
I guess if you were using the LLM to process data from your customers, e.g. categorise their emails, then this argument would hold that they might be more risky.
Access to untrusted data. Access to private data. Ability to communicate with the outside. Pick two. If the LLM has all three, you're cooked.
Agreed. Some of the big companies seem to be claiming that by going with ReallyBitCompany's AI you can do this safely, but you can't. Their models are harder to trick, but simply cannot be made safe.
Local LLMs may not be exposed to the internet, but if you want them to do something useful you're likely going to hook them up to an internet-accessing harness such as OpenCode or Claude Code or Codex CLI.
No, I'm not going to do those things. I find extreme utility in applications that I can do with an LLM in an air-gapped environment.

I will fight and die on the hill that "LLMs don't need the internet to be useful"

Yeah, that's fair. A good LLM (gpt-oss-20b, even some of the smaller Qwens) can be entirely useful offline. I've got good results from Mistral Small 3.2 offline on a flight helping write Python and JavaScript, for example.

Having Claude Code able to try out JSON APIs and pip install extra packages is a huge upgrade from that though!

Is anyone fighting you on that hill?

Someone who finds it useful to have a local llm ingest internet content is not contrary to you finding uses that don't.

> Local LLMs may not be exposed to the internet, but if you want them to do something useful you're likely going to hook them up to an internet-accessing harness such as OpenCode or Claude Code or Codex CLI.

is not "someone finding useful to have a local llm ingest internet content" - it was someone suggesting that nothing useful can be done without internet access.

I guess I don't read that how you do. It says you're likely to do that, which I take to mean that's a majority use case, not that it's the only use case.
It also said "but" and "if you want them to do something useful" which made the "likely" sound much less innocent.
Yeah, I retracted my statement that they can't do anything useful without the internet here: https://news.ycombinator.com/item?id=45670828
Fair enough. Forgive my probably ignorance, but if Claude Code can be attacked like this, doesn’t that means that also foundation LLMs are vulnerable to this, and is not a local LLM thing?
It's not an LLM thing at all. Prompt injection has always been an attack against software that uses LLMs. LLMs on their own can't be attacked meaningfully (well, you can jailbreak them and trick them into telling you the recipe for meth but that's another issue entirely). A system that wraps an LLM with the ability for it to request tool calls like "run this in bash" is where this stuff gets dangerous.
yes and I think better local sandboxing can help out in this case, it’s something ive been thinking about a lot and more and more seems to be the right way to run these things
An LLM can be an “internet in a box” — without the internet!
Welcome to corporate security. "If an attacker infiltrates our VPN and gets on the network with admin credentials and logs into a workstation..." Ya, no shit, thanks Mr Security manager, I will dispose of all of our laptops.
Yeah, I don't understand what the hosting environment of the LLM has to do with this. Seems like FUD from people with an interest in SaaS LLMs.

If you're leveraging an LLM that can receive arbitrary inputs from vetted sources, and allowing that same LLM to initiate actions that target your production environment, you are exposing yourself to the same risk regardless of whether the LLM itself is running on your servers or someone else's.