

by btown 873 days ago

Who watches the watchers?

If I understand the product correctly, you're suggesting customers opt into letting an LLM pentest their testing systems, and allowing that LLM to generate and carry out plans of attack.

Imagine a recurring revenue business that keeps tokens for user credit cards on file, and then a dev naively gives the CI infrastructure an ability to call out/proxy some calls to production in a privileged way, and then Escape finds a way to break out of CI and charge cards on the production system. Of course, this is a massive security issue in and of itself, but at a certain point, a human pentester would know "holy ** I should stop what I'm doing right now." How do we know that Escape won't keep fuzzing and fuzzing and exacerbate the situation, causing real-world impact to customers?

There's probably a philosophical take on this - that security by obscurity is no security at all, and that threat actors will be every bit as good at this as Escape's technology is. But for any business that's not really a dedicated target for actors (say, only gets drive-by script kiddies that are easily fended off by keeping software up to date) using Escape might be increasing their risk of a breach that is meaningful to their customers, by inviting the scrutiny of a well-funded LLM, with a laser focus only on your specific business, that doesn't know when to stop.


glimow 873 days ago

Hello btown, you are indeed raising legitimate questions here.

You are right in the sense that using automated security testing tools in production creates a risk. But there are workarounds:

1) Most of Escape's security scans happen on staging or pre-prod environments, where there is little risk of breaking something critical or finding real customer data.

2) We have designed a specific scan mode for production APIs that is built with safety in mind. It will not attempt the riskiest attack scenarios and is thus safe for production use, at the cost of scanning depth.

You can choose a scan mode when adding a new application for testing in Escape. So far, most of our users use both modes, one for the production environment and one for the development environment, to spot bugs early.

No user has ever had problems with the production scanning mode.

By the way, the core algorithm powering Escape is more a graph traversal algorithm than an LLM. We do use a small, self-hosted LLM for specific inference tasks, but everything is built in-house, and we don't use OpenAI or any other inference API.

Hope that helps!

ichbinlegion 873 days ago

> It will not attempt the riskiest attack scenarios

What does that mean exactly?

Do you manually assess what is risky for a particular API, or is it up to the system to choose?

If it's up to the system, what happens if it decides that deleting user data isn't risky?

glimow 871 days ago

We created specific safeguards for production mode; for instance, Escape doesn't launch any DELETE requests in prod mode.

You can also manually configure an allowlist/blocklist of operations for specific use cases.
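
To make the idea concrete, here is a minimal sketch of how a scanner could gate operations before sending them. This is not Escape's actual implementation or configuration format: `Operation`, `ScanPolicy`, and `permits` are hypothetical names, and the rule ordering (the production DELETE ban is checked first, so an allowlist cannot re-enable it) is an assumption made for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical types for illustration only; not Escape's real API.

@dataclass(frozen=True)
class Operation:
    method: str  # HTTP method, e.g. "GET"
    path: str    # route template, e.g. "/users/{id}"


@dataclass
class ScanPolicy:
    production: bool = False
    # Sets of (METHOD, path) pairs. An empty allowlist means "no restriction".
    allowlist: set = field(default_factory=set)
    blocklist: set = field(default_factory=set)

    def permits(self, op: Operation) -> bool:
        """Return True if the scanner may send this operation."""
        key = (op.method.upper(), op.path)
        # Hard safeguard (assumed ordering): never send DELETE in production mode.
        if self.production and key[0] == "DELETE":
            return False
        # An explicit block always wins over the allowlist.
        if key in self.blocklist:
            return False
        # If an allowlist is configured, only listed operations may run.
        if self.allowlist and key not in self.allowlist:
            return False
        return True


prod = ScanPolicy(production=True, blocklist={("POST", "/charges")})
candidates = [
    Operation("GET", "/users"),
    Operation("DELETE", "/users/{id}"),
    Operation("POST", "/charges"),
]
# Only operations the policy permits would be passed on to the scanner.
safe_to_send = [op for op in candidates if prod.permits(op)]
```

With this policy, only the `GET /users` operation survives the filter. A method-level rule like this does not cover btown's scenario of a destructive action hidden behind a POST, which is where a manually maintained blocklist would have to come in.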