by xorvoid 818 days ago
Could you expand on how this metric would be practically defined?

The problem isn’t lifting to C code, but rather lifting to “good C code”. For example, you can do a 1-to-1 translation of each assembly instruction into C code that makes the same machine state changes. This is usually not what you want, as it comes with a lot of extra cruft. When people think “decompiler”, they think of an output that looks like what they would personally write. But that’s very ill-defined. And, personally, idk how one would define such a thing.
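To make the "extra cruft" concrete, here is a rough sketch of what a 1-to-1 lift might look like next to the version a person would write. The `Cpu` struct, the register and flag names, and the three-instruction snippet are all made up for illustration; no particular decompiler emits exactly this.

```c
#include <stdint.h>

/* Hypothetical machine state for an x86-like CPU (illustrative only). */
typedef struct {
    uint32_t eax, edi, esi; /* general-purpose registers */
    int zf, cf;             /* zero flag, carry flag */
} Cpu;

/* Literal lift of:  mov eax, edi ; add eax, esi ; ret
 * Every register write and flag update is reproduced, needed or not. */
void lifted_add(Cpu *cpu) {
    cpu->eax = cpu->edi;          /* mov eax, edi */
    uint32_t old = cpu->eax;
    cpu->eax = old + cpu->esi;    /* add eax, esi */
    cpu->cf = cpu->eax < old;     /* carry out of the unsigned add */
    cpu->zf = cpu->eax == 0;      /* result was zero */
}

/* What a person would probably write for the same routine. */
uint32_t add(uint32_t a, uint32_t b) {
    return a + b;
}
```

Both compute the same sum, but only the second reads like source code, and that gap is the part that is hard to pin down as a metric.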


I am brainstorming here.

In practice, perhaps it could be a C program that acts as a validation test. The source code of this C program is not publicly available. Only the binary is distributed. Let us name the binary ctestbox.

When ctestbox is run, it creates a multiplicity of new text or binary files. Each of these is like a unit test.

Consider a tool that decompiles a binary. Given ctestbox, this tool should produce C source that compiles to an a.out which, when run, ideally creates identical text or binary files. The metric is then simply the number of identical files.
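A minimal sketch of the counting step, assuming the files written by ctestbox and by a.out have already been collected under two separate path prefixes and that the list of file names is known up front. The function names, the prefix convention, and the 1024-byte path limit are my own placeholders.

```c
#include <stdio.h>
#include <stddef.h>

/* Returns 1 if both files can be opened and contain the same bytes, else 0.
 * Note: a read error is not distinguished from end-of-file here. */
int files_identical(const char *path_a, const char *path_b) {
    FILE *fa = fopen(path_a, "rb");
    FILE *fb = fopen(path_b, "rb");
    int same = (fa != NULL && fb != NULL);
    while (same) {
        int ca = fgetc(fa);
        int cb = fgetc(fb);
        if (ca != cb) {
            same = 0;           /* differing byte, or one file ended early */
        } else if (ca == EOF) {
            break;              /* both files ended together */
        }
    }
    if (fa) fclose(fa);
    if (fb) fclose(fb);
    return same;
}

/* Counts how many of the n named files are byte-identical under the two
 * prefixes, e.g. "ref/" (from ctestbox) and "out/" (from the rebuilt a.out). */
size_t count_identical(const char *ref_prefix, const char *out_prefix,
                       const char *names[], size_t n) {
    char ref_path[1024], out_path[1024];
    size_t matches = 0;
    for (size_t i = 0; i < n; i++) {
        int r = snprintf(ref_path, sizeof ref_path, "%s%s", ref_prefix, names[i]);
        int o = snprintf(out_path, sizeof out_path, "%s%s", out_prefix, names[i]);
        if (r < 0 || o < 0 ||
            (size_t)r >= sizeof ref_path || (size_t)o >= sizeof out_path) {
            continue;           /* path did not fit; count it as a mismatch */
        }
        matches += (size_t)files_identical(ref_path, out_path);
    }
    return matches;
}
```

Calling `count_identical("ref/", "out/", names, n)` would then give the score, with `n` as the maximum. Note that this only measures behavioral equivalence of the recompiled program; it says nothing about how readable the decompiled C is.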