|
|
|
|
|
by philzook
868 days ago
|
|
This is actually one of the reasons I've been tinkering with CBMC https://www.philipzucker.com/pcode2c/ The idea is to use Ghidra to lift a specialized C interpreter of the machine code and then compare to source or ask other questions via CBMC |
|
I do recommend standardizing on hardware and toolchain for projects like these, as it can ensure that the machine code is matched.
The third phase of my work will round the corner with machine code analysis. Currently, I'm working on constructive proofs and equivalence proofs of imported C code to handle the pieces that CBMC doesn't do so well, as part of my second phase. So far, I can extract efficient C from Lean 4, but I want to import C directly into Lean 4 to prove it equivalent to extracted code. Hand-written C is easier to read than the line noise I can extract.
In particular, model checking doesn't fare well with data structures, algorithms, and cryptography. These can be torn down quite a bit, and at least we can verify that the algorithms don't leak memory or make bad integer assumptions. But, if we want to verify other properties, this requires constructive proofs.
Between 80 and 95 percent of the time, depending on the type of code being written, CBMC is enough. I'm now solving for that 5 - 20 percent of code that CBMC can't easily tackle.