Hacker News
by taspeotis 2 hours ago
I don't know what the solution to this is, but I find it somewhat unfair that I pay money to Anthropic, and I pay money to OpenAI, and neither of them will let me use their best models for securing the software I work on.

Admittedly Opus 4.8 xhigh does a good job, but are my customers not entitled to have more security from a Fable/Mythos or GPT-5.5-Cyber audit over the codebase? Or I guess the inverse question: why aren't they allowed that audit?

(Fable/Mythos being unavailable notwithstanding.)

It seems OpenAI will at least let me do this narrowly, at greater cost, by using one of their partners. But I already pay them money!

7 comments

The problem is even worse than that. OpenAI and Anthropic have your source code and superior knowledge of its vulnerabilities. All you can do is hope that they won't one day use it against you.
But they will! Or the government, or some xyz agency!
take a look at this bug and the chain required to exploit it:

https://projectzero.google/2021/12/a-deep-dive-into-nso-zero...

https://projectzero.google/2022/03/forcedentry-sandbox-escap...

exploiting vulnerabilities on hardened targets isn't just in a different league from finding them, it is a different sport altogether.

put simply, it's the difference between an integer overflow leading to a sandbox-escaping RCE and one that leads to a crash.
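
For readers less familiar with the bug class, here is a minimal sketch of how an integer overflow in a size calculation arises. It simulates 32-bit unsigned arithmetic in Python purely for illustration; the function names and the input values are hypothetical and are not taken from the linked Project Zero write-ups. Whether a bug like this ends in a crash or in a working exploit depends on everything around it, which is the distinction drawn above.

```python
# Hypothetical illustration: simulate a 32-bit unsigned size computation.
U32_MASK = 0xFFFFFFFF


def alloc_size_u32(count: int, elem_size: int) -> int:
    """Return count * elem_size truncated to 32 bits (wraps silently, like uint32_t in C)."""
    return (count * elem_size) & U32_MASK


def checked_alloc_size(count: int, elem_size: int) -> int:
    """Same computation, but reject inputs whose true product does not fit in 32 bits."""
    true_size = count * elem_size
    if true_size > U32_MASK:
        raise OverflowError("size computation overflows 32 bits")
    return true_size


# An attacker-chosen element count: 0x40000001 elements of 4 bytes each.
count, elem_size = 0x40000001, 4
wrapped = alloc_size_u32(count, elem_size)  # size the buffer would be allocated with
needed = count * elem_size                  # bytes a later copy loop would try to write

print(f"allocated: {wrapped} bytes, needed: {needed} bytes")
```

Finding the mismatch between `wrapped` and `needed` is the discovery step; the fix is the bounds check in `checked_alloc_size`. Turning the resulting out-of-bounds write into reliable code execution on a hardened target is a separate and much larger effort.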

Codex Security and 5.5/5.6 are still very good at finding vulnerable code -- they will identify and fix unsafe behavior, but they will refuse to help you with exploitation -- they will actively prevent you from taking any steps to weaponize the unsafe behavior that are not required to remediate it. they will err conservative here, but for the most part they will still let you discover and address a wide range and depth of vulnerabilities. you can verify yourself to turn off the most basic safeguards and sign up through a more rigorous process for a spectrum of TAC options.

obviously there is a balance here -- openai wants to empower defenders while at the same time not exposing capabilities to the adversaries that would overwhelm defenders. there is no "right" answer. it is a work in progress. this is an intentional and deliberate decision to provide defenders with a (temporary, dwindling) advantage.

the example i chose was pretty extreme, but the underlying principle -- enable visibility, discovery, and remediation, but make it difficult to weaponize and defeat countermeasures -- makes sense given the bigger picture, IMO.

this calm before the storm is not going to last for very long, and defenders need every advantage they can get to get their houses in order before these capabilities are widely commoditized.

I think using open weight models will solve this. I believe they have nearly caught up, and much of the gain is in the harnesses or the proper orchestration of subqueries. (I'm no expert, just my opinion.)

When the open weight models catch up, if OpenAI and Anthropic don't lobby to get them banned, then you'll be able to use them to properly secure your software.

I'm no cyber expert, maybe one can weigh in.

Are there zero days that only a true genius can discover? Or can a smart-enough model, run over the codebase for enough time, discover them all?

Like, as we get smarter and smarter models, do we expect each new generation to keep finding vulnerabilities, or to plateau?

A large part of vulnerability analysis is just having the time to crunch through enough possibilities. Expertise and smarts definitely speed this up, but there's a lot of just turning the crank until something falls out. Even a relatively dumb model with some good prompting will find vulnerabilities if you ask it to and give it the time and resources to do so.
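
As a rough illustration of "turning the crank", here is a minimal sketch of a random-input fuzzer run against a toy parser. Everything in it is hypothetical: the parser, its made-up format (a `0x7F` magic byte followed by a length byte), and the planted bug are assumptions for the example, not a real tool or target. It uses no intelligence at all, only enough attempts.

```python
import random


def toy_parser(data: bytes) -> int:
    """Hypothetical buggy parser: magic byte 0x7F, then a length byte that is trusted blindly."""
    if len(data) < 2 or data[0] != 0x7F:
        return 0  # not our made-up format; ignore it
    length = data[1]
    payload = data[2:]
    # Planted bug: no check that length <= len(payload), so this can raise IndexError.
    return sum(payload[i] for i in range(length))


def fuzz(target, attempts: int, seed: int = 0):
    """Feed random byte strings to target; return (input, attempt number) on the first crash, else None."""
    rng = random.Random(seed)  # seeded so a run can be reproduced
    for attempt in range(1, attempts + 1):
        data = bytes(rng.randrange(256) for _ in range(rng.randrange(1, 8)))
        try:
            target(data)
        except IndexError:
            return data, attempt
    return None


result = fuzz(toy_parser, attempts=10_000)
if result is not None:
    crashing_input, attempt = result
    print(f"crash after {attempt} attempts with input {crashing_input.hex()}")
```

Most random inputs are rejected at the magic-byte check, so the crash only turns up after many tries. Real fuzzers add coverage feedback and smarter mutation, but the basic loop of generating, running, and watching for a failure is the same.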
Pretty sure the secret sauce is in the summarised thinking. Maybe a better thought process… But I have a feeling it’s server-side tools and a scratch space to prepare the reply.

Sometimes the summarised thoughts include stuff that makes no sense unless it’s got a workspace on the server. Stuff like “I am now writing x to file y”.

Surely what's coming is them offering to fix your vulnerabilities via higher-margin professional services?
While I appreciate the desire to have the best:

> Or I guess the inverse question: why aren't they allowed that audit?

There's undeniably a lot of unsecured software in the world.

Given that ID verification is hard and these companies are clearly new at it (or don't understand the implications of it, cough Worldcoin's eye-scanning orbs cough), which is worse:

(1) sufficiently good AI* is released to everyone: critical infrastructure and open source projects get better hacking tools to white-hack their own code at exactly the same time as black hat hackers do

(2) sufficiently good AI* is released to critical infrastructure and open source projects first: everyone else, including the average paying customer, has to wait, but so do the black hats

Because (2) is either the status quo or better, depending on whether you have access or not, and because (1) seems to me to lead to an acceleration of zero-days, I lean towards (1) being worse.

* having no experience of pen-testing, I take no position on if this is "it" or not

Soon, very soon, if you need something useful, like medical or financial advice, you will be told that, well, OK, but you need to pay for an "extended license" that's going to cost thousands of dollars per month; otherwise you need to hire someone who paid that money.

The only hope is Chinese models, as Chinese commies are playing a different game as long as they are behind the flagship models (but that will change soon, like with cheap Chinese cars), and maybe, finally, Europe will start working on its own solutions instead of regulations.

I'm not sure I follow your logic. Paying for a service does not mean you get access to all potential services a provider offers. Providers can choose to keep some services internal.

Silly example: I pay Netflix for their most basic plan, so I get ads. Just because I already pay them money doesn't mean I have a right to no ads! It also doesn't mean I have a right to 8K streaming; maybe Netflix reserves that for their internal cinema.