by albertan017 824 days ago
Thanks! We're working on Ghidra/IDA Pro. The problem we face is finding the right kind of data to test with and deciding how to evaluate it. There's no "standard" benchmark or set of metrics that everyone uses for decompilation.
1 comment

As others have said, the standardization of metrics is still debated. At the same time, this space has been explored by several top-tier papers that your paper did not cite. For example, DREAM [1] was evaluated using the classic metric of gotos emitted; Rev.ng [2] was evaluated using cyclomatic complexity and gotos; and SAILR [3] was evaluated using the previous metrics plus a graph edit distance score for the structure of the code.
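To make the first two metrics concrete, here is a rough Python sketch. It is only my illustration, not how any of the cited papers computed their numbers: the function names are made up, the goto counter is a naive regex over decompiled C text, and the complexity function assumes you already have the control-flow graph as a list of (src, dst) edges and uses McCabe's formula M = E - N + 2P.

```python
import re


def count_gotos(c_source: str) -> int:
    """Count `goto label;` statements in decompiled C text.

    Naive on purpose: it does not skip comments or string literals.
    """
    return len(re.findall(r"\bgoto\s+\w+\s*;", c_source))


def cyclomatic_complexity(edges, num_components=1):
    """McCabe's M = E - N + 2P for a CFG given as (src, dst) edge pairs.

    Nodes are inferred from the edges, so isolated nodes are not counted.
    """
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2 * num_components


# Hypothetical decompiler output with two jumps to the same label.
snippet = "if (a) goto out; if (b) goto out; out: return 0;"
print(count_gotos(snippet))

# CFG of a single if/else: entry branches to two blocks that rejoin at exit.
if_else_cfg = [
    ("entry", "then"),
    ("entry", "else"),
    ("then", "exit"),
    ("else", "exit"),
]
print(cyclomatic_complexity(if_else_cfg))
```

Lower is generally read as better for both numbers when comparing two decompilers' output for the same function, which is why they are cheap baselines to report alongside any new metric.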

I feel that without justifying why you dropped metrics previously established through the peer review process, you weaken your new metrics. However, I still think this is an interesting paper. It could just be made more legit by thoroughly reading and citing previous work in the area and building an argument for why you go against it.

[1]: https://net.cs.uni-bonn.de/fileadmin/ag/martini/Staff/yakdan...
[2]: https://rev.ng/downloads/asiaccs-2020-paper.pdf
[3]: https://www.usenix.org/system/files/sec23winter-prepub-301-b...

References make a paper stronger!