Hacker News new | ask | show | jobs
by yjftsjthsd-h 949 days ago
Every other kind of software regularly gets vulnerabilities; are LLMs worse?

(And they're a very young kind of software; consider how active the cat and mouse game was finding bugs in PHP or sendmail was for many years after they shipped)

4 comments

> Every other kind of software regularly gets vulnerabilities; are LLMs worse?

This makes it sound like all software sees vulnerabilities at some equivalent rate. But that's not the case. Tools and practices can be more formal and verifiable or less so, and this can effect the frequency of vulnerabilities as well as the scope of failure when vulnerabilities are exposed.

At this point, the central architecture of LLM's may be about the farthest from "formal and verifiable" as we've ever seen a practical software technology.

They have one channel of input for data and commands (because commands are data), a big black box of weights, and then one channel of output. It turns out you can produce amazing things with that, but both the lack of channel segregation on the edges, and the big black box in the middle, make it very hard for us to use any of the established methods for securing and verifying things.

It may be more like pharmaceutical research than traditional engineering, with us finding that effective use needs restricted access, constant monitoring for side effects, allowances for occasional catastrophic failures, etc -- still extremely useful, but not universally so.

That's like a now-defunct startup I worked for early in my career. Their custom scripting language worked by eval()ing code to get a string, searching for special delimiters inside the string, and eval()ing everything inside those delimiters, iterating the process forever until no more delimiters were showing up.

As you can imagine, this was somewhat insane, and decent security depended on escaping user input and anything that might ever be created from user input everywhere for all time.

In my youthful exuberance, I should have expected the CEO would not be very pleased when I demonstrated I could cause their website search box to print out the current time and date.

> At this point, the central architecture of LLM's may be about the farthest from "formal and verifiable" as we've ever seen a practical software technology.

+100 this.

Imagine if every time a large company launched a new SaaS product, some rando on Twitter exfiltrated the source code and tweeted it out the same week. And every single company fell to the exact same vulnerability, over and over again, despite all details of the attack being publicly known.

That's what's happening now, with every new LLM product having its prompt leaked. Nobody has figured out how to avoid this yet. Yes, it's worse.

PHP was one of my first languages. A common mistake I saw a lot of devs make was using string interpolation for SQL statements, opening the code up to SQL injection attacks. This was fixable by using prepared statements.

I feel like with LLMs, the problem is that it's _all_ string interpolation. I don't know if an analog to prepared statements is even something that's possible -- seems that you would need a level of determinism that's completely at odds with how LLMs work.

Yeah, that's exactly the problem: everything is string interpolation, and no-one has figured out if it's even possible to do the equivalent to prepared statements or escaped strings.
Yes, they are worse - because if someone reports a SQL injection of XSS vulnerability in my PHP script, I know how to fix it - and I know that the fix will hold.

I don't know how to fix a prompt injection vulnerability.