by spr-alex 86 days ago
I interned for the author at 18. I assumed security testing worked like this:

1. Static analysis catches nearly all bugs with near-total code coverage

2. Private tooling extends that coverage further with better static analysis and dynamic analysis, and that edge is what makes contractors valuable

3. Humans focus on design flaws and weird hardware bugs like cryptographic side-channels from electromagnetic emanations

Turns out finding all the bugs is really hard. Codebases and compiler output have exploded in complexity over 20 years, which has not helped the static analysis vision. Today's mitigations are fantastic compared to then, but just this month a second 0-day chain got patched on one of the best platforms for hardware mitigations.

I think LLMs get us meaningfully closer to what I thought this work already was when I was 18 and didn't know anything.


Lots of security issues form at the boundaries between packages, zones, services, sessions, etc. From my perspective, static analysis could catch this stuff but doesn't seem to. Bugs are often chains, and finding them requires a lot of creativity, planning, etc.

Consider logic errors and race conditions. It's surely not impossible for an LLM to find these, but it seems likely that you'll need to step through the program's control flow in order to reveal a lot of these interactions.
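To make the race-condition point concrete, here is a minimal sketch of what "stepping through" means for the classic lost-update bug. It is a toy model, not a real race detector: the two "threads" A and B, the read-then-write increment, and the helper names `interleavings` and `run` are all made up for illustration.

```python
from itertools import combinations

def interleavings(n_a, n_b):
    """Yield every schedule of two threads' steps, keeping each thread's own order."""
    total = n_a + n_b
    for a_slots in combinations(range(total), n_a):
        slots = set(a_slots)
        yield ["A" if i in slots else "B" for i in range(total)]

def run(schedule):
    """Each thread does two steps: read shared into a local, then write local + 1."""
    shared = 0
    local = {"A": None, "B": None}
    step = {"A": 0, "B": 0}
    for thread in schedule:
        if step[thread] == 0:
            local[thread] = shared          # step 1: read
        else:
            shared = local[thread] + 1      # step 2: write back
        step[thread] += 1
    return shared

# Group schedules by the final counter value they produce.
outcomes = {}
for schedule in interleavings(2, 2):
    outcomes.setdefault(run(schedule), []).append("".join(schedule))

for final_value, schedules in sorted(outcomes.items()):
    print(final_value, schedules)
```

Only the two schedules where one thread fully finishes before the other starts give the expected 2; the other four lose an update. Nothing in the source text of either thread looks wrong on its own, which is roughly why you end up walking the interleavings.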

I feel like people consider LLMs free since there isn't as much hands-on-keyboard time. I kinda disagree, and when the payouts for these vulns fall, I feel like nobody is gonna wanna eat the token spend. Plenty of hackers already use AI in their workflows, and even then it is a LOT OF WORK.

Catching all bugs with static analysis would involve solving the halting problem, so it's never going to happen.
A lot of software doing useful work halts pretty trivially, consuming inputs and doing bounded computation on each of them. You're not going to recurse much in click handlers or keep making larger requests to handle the current one.
I was just very naive at 18 about program analysis. I haven't lost my imagination though. I was a self-taught IOI gold division competitor. I thought every problem had an algorithm. It doesn't work like that. Program analysis is collecting special snowflakes that melt in your hand. There is no end to the ways you can write a bug in C. Ghosts of Semmle, Semgrep, Coccinelle past, be humbled. LLMs saturate test coverage in a way no sane human would. I do not think they can catch all bugs because of the state space explosion though, but they will help all programmers get better testing. At the end of the day I believe language choice can obviate security bugs, and C/C++ is not easy or simple to secure.
You've never seen the full power of static analysis, dynamic analysis, and test generation. The best examples were always silo'd, academic codebases. If they were combined, and matured, the results would be amazing. I wanted to do that back when I was in INFOSEC.

That doesn't even account for lightweight formal methods. SPARK Ada, the Jahob verification system with its many solvers, Design by Contract, LLMs spitting this stuff out from human descriptions, type systems like Rust's, etc. Speed-run producing those with AI, with the unsafe stuff checked by the combo of tools I already described.

Silo’d, academic codebases are not under the kind of attacks that commodity software is.
The silo'd codebases I was referring to are the verification tools academics produce. They're used to prevent attacks. Each tool has one or more capabilities others lack. If combined, they'd catch many problems.

Examples: KLEE test generator; combinatorial or path-based testing; CPAchecker; race detectors for concurrency; SIF information flow control; symbolic execution; Why3 verifier, which commercial tools already build on.

If you start with safety in mind and don't just try to bolt it on, you're in a much better place. With the kind of code you need in typical applications, you could force the vast majority of it into a shape that passes termination checks in theorem provers without much overhead, especially if you can just put the gnarly things in the standard library and validate them (with proofs, hopefully) once.

Though starting with C/C++ is a losing proposition in that regard. And I guess any kind of discipline loses to just throwing half-baked JavaScript at the wall, because deadlines don't care about bugs.

Catching all bugs with static analysis is actually really easy, as long as you don't mind false positives.
Conventional static analysis tools come nowhere close to catching all bugs, even accounting for the false positives.
Sorry, it was supposed to be a joke.

If everything is reported as a bug, there will be 0 false negatives but a lot of false positives.
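For anyone who wants the joke as arithmetic, a quick sketch. The codebase of 100 functions and the three "actually buggy" ones are invented numbers, and `flag_everything` and `score` are hypothetical helpers, not any real tool's API.

```python
def flag_everything(functions):
    """A deliberately useless 'analyzer': report every function as buggy."""
    return set(functions)

def score(reported, actually_buggy):
    """Compare a set of reports against the (assumed known) set of real bugs."""
    true_pos = len(reported & actually_buggy)
    false_pos = len(reported - actually_buggy)
    false_neg = len(actually_buggy - reported)
    return {
        "false_negatives": false_neg,
        "false_positives": false_pos,
        "recall": true_pos / len(actually_buggy),
        "precision": true_pos / len(reported),
    }

# Made-up codebase: 100 functions, 3 of which really contain a bug.
all_functions = {f"fn_{i}" for i in range(100)}
actually_buggy = {"fn_3", "fn_41", "fn_77"}

result = score(flag_everything(all_functions), actually_buggy)
print(result)
```

Recall comes out perfect and precision is 3 in 100, so a human still has to triage 97 bogus reports to find the real ones.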

Only if you're using a Turing-complete programming language, and why would you do that?
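A rough sketch of what giving up Turing-completeness buys you. The mini-language here is entirely made up for illustration: programs are nested tuples with only `inc`, `seq`, and `repeat` with a literal count, so there is no `while` and no recursion. Under those assumptions every program halts, and a static pass can say exactly how much work it does without running it.

```python
def step_bound(prog):
    """Statically count the 'inc' steps a program performs, without running it."""
    op = prog[0]
    if op == "inc":
        return 1
    if op == "seq":
        return sum(step_bound(p) for p in prog[1:])
    if op == "repeat":
        _, count, body = prog
        return count * step_bound(body)   # loop count is a literal, known up front
    raise ValueError(f"unknown op: {op!r}")

def run(prog, acc=0):
    """Actually execute the program, returning the final accumulator value."""
    op = prog[0]
    if op == "inc":
        return acc + 1
    if op == "seq":
        for p in prog[1:]:
            acc = run(p, acc)
        return acc
    if op == "repeat":
        _, count, body = prog
        for _ in range(count):
            acc = run(body, acc)
        return acc
    raise ValueError(f"unknown op: {op!r}")

# Hypothetical program: one inc, then 3 times (one inc, then 2 incs).
program = ("seq", ("inc",), ("repeat", 3, ("seq", ("inc",), ("repeat", 2, ("inc",)))))

print(step_bound(program), run(program))
```

The static count and the real run agree because the language cannot express a loop whose length depends on anything the analysis can't see. The price is that you can't write most of the programs people actually want in it.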