Hacker News new | ask | show | jobs
by majormajor 1112 days ago
When you talk about not being able to analyze these based on their code do you mean because today they're all just calling out to OpenAI or whoever?

The risks listed in the article itself mostly seem to fall under the same, non-AI-extension, core problem of "you're given them all your data." And that's a risk for non-AI-based extensions too, but if you look at the code of an AI one, it's gonna be obvious that it's shipping it off to a third party server, right? And once that happens... you can't un-close that door.

(The risks about copyright and such of content you generate by using AI tools are interesting and different, but I don't know that I'd call them security ones.)

The prompt injection one is pretty interesting, but still seems to fall under "traditional" plugin security issues: if you authorize a plugin to read everything on your screen, AND have full integration with your email, or whatever, then... that's a huge risk. The AI/injection part makes it triggerable by a third-party, which certainly raises the alarm level a lot, but also: bad idea, period, IMO.

2 comments

>When you talk about not being able to analyze these based on their code do you mean because today they're all just calling out to OpenAI or whoever?

I think that the issue here is that AIs are probabilistic in nature, meaning that you can't fully predict their behavior in a particular situation just by reading the code. Instead in a tipical (non AI poweered) extension, the code is a precise description of what the extension will do in every possible situation.

> When you talk about not being able to analyze these based on their code do you mean because today they're all just calling out to OpenAI or whoever?

I mean that ML models are inherently inscrutable, it is extremely hard to determine how they operate internally, so no-one can identify any definite boundaries of what it will and will not output, or why. Hence prompt engineering, Bing's Sydney alternate personality, and weird hallucinated image artifacts.

Sure, if a user is calling OpenAI, they obviously can't understand the details of how it generates text. But neither can OpenAI! And if it produces something surprising, there's no way to fix it by directly modifying the model, the only way to do it is via ML techniques in the first place.