HN Mirror

by lebovic 1 hour ago

Claims of retribution aside, one strawman is that Mythos is likely the most capable model that's usable by folks like the NSA [1], and decision-makers across the USG and industry partners have seen a stream of reports of Mythos successfully finding serious vulnerabilities over the past couple months due to Glasswing.

So even if GPT 5.5 is just as capable in these scenarios (which, imo, it largely is), it is not known by the government apparatus as having the same capabilities.

Personally, I think we crossed the threshold of capabilities with Opus 4.6 [2], which translated to an even more capable open-weight GLM 5.1 (which is rumored to have been distilled from Opus 4.6) [3][4]. But the USG and its partners aren't fully rational actors with perfect data, so it's possible they're only viscerally aware of these capabilities in the context of Mythos.

[1]: https://www.reuters.com/business/us-security-agency-is-using...

[2]: Opus 4.6 was used for https://www.noahlebovic.com/testing-an-autonomous-hacker/

[3]: See GLM 5.1 scoring in https://www.cybergym.io/cybergym/

[4]: https://dualuse.dev/posts/chinese-models-are-sometimes-bette...

Topfi 1 hour ago

I doubt that the capabilities of GPT-5.5-cyber aren’t known by the US government, considering OpenAI is their primary LLM partner after Anthropic had concerns about using models for autonomous weaponry and mass surveillance of US citizens. If anything, they should have more experience with GPT-5.5’s full feature set due to longer access and may even already have GPT-5.6 access.

bobthepanda 15 minutes ago

Hanlon's razor. Are the people with the right access talking to the right people? Wouldn't be the first time for miscommunication in the executive branch.

lebovic 1 hour ago

They made a deal for access, but I'm unsure if it's usable, scaled, and has vulnerabilities attributed to it at this point. But I have no inside information here, so I could be wrong.

throwaway85825 52 minutes ago

If it had vulnerabilities attributed to it, the marketing copy would already be written and published.