by lebovic
1 hour ago
Claims of retribution aside, one strawman is that Mythos is likely the most capable model that's usable by folks like the NSA [1], and decision-makers across the USG and industry partners have seen a stream of reports of Mythos successfully finding serious vulnerabilities over the past couple of months due to Glasswing. So even if GPT 5.5 is just as capable in these scenarios (which, imo, it largely is), the government apparatus doesn't know it as having the same capabilities.

Personally, I think we crossed the threshold of capabilities with Opus 4.6 [2], which translated to an even more capable open-weight GLM 5.1 (which is rumored to have been distilled from Opus 4.6) [3][4]. But the USG and its partners aren't fully rational actors with perfect data, so it's possible they're only viscerally aware of these capabilities in the context of Mythos.

[1]: https://www.reuters.com/business/us-security-agency-is-using...
[2]: Opus 4.6 was used for https://www.noahlebovic.com/testing-an-autonomous-hacker/
[3]: See GLM 5.1 scoring in https://www.cybergym.io/cybergym/
[4]: https://dualuse.dev/posts/chinese-models-are-sometimes-bette...