by burkaman 3376 days ago
No, Java class files usually include the names of variables and functions, so this isn't the disassembler's fault. The class file actually had two functions with the same name. You could certainly implement an anti-obfuscation layer to detect stuff like this, but I wouldn't call it an "error" as is.
2 comments

I think it's a common expectation that a disassembler should provide output that is valid to be compiled, and that therefore this is an error.
Sure, but it's also expected that it should provide output that could be compiled to produce the input, and in this case it's impossible to satisfy both those constraints. The best thing would probably be to leave a comment in the generated source code explaining the problem, and provide an option to rename overlapping functions.
It's not necessarily always possible to output valid-to-compile Java sources. If the bytecode came from a different JVM language, it can contain patterns that javac would never emit.
I would argue that decompilers are primarily reference tools (i.e., a more readable disassembly). It is wrong to see them as source code recovery tools because they will never be able to capture every aspect of the original program. So it doesn't make sense to have as a primary goal the ability to provide output that can be compiled again. It is more important that they more faithfully represent the disassembly.
Well, it's not a JVM class file but a Dalvik dex file. A disassembler which can't generate compilable code from this valid bytecode is incomplete, or in other words, buggy.
Not if you are talking about compilable Java code, because as the article explains, there are things you can express in bytecode which you simply cannot express in Java source code.
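To make that concrete, here is a small sketch of one such case that javac itself produces: bridge methods. At the class-file level a method is identified by name, parameter types and return type, so a class can hold two `get()` methods that differ only in return type, which you could not write by hand in one Java class. The names `BridgeDemo`, `Name` and `countNoArgMethods` are made up for illustration; this is not the obfuscation trick from the article, just a harmless relative of it.

```java
import java.lang.reflect.Method;
import java.util.function.Supplier;

class BridgeDemo {
    // Hypothetical example class: implements Supplier<String> with a String return type.
    // javac adds a synthetic bridge method "Object get()" next to "String get()".
    static class Name implements Supplier<String> {
        @Override
        public String get() {
            return "hello";
        }
    }

    // Counts the declared methods of a class that have the given name and no parameters.
    static int countNoArgMethods(Class<?> type, String name) {
        int count = 0;
        for (Method m : type.getDeclaredMethods()) {
            if (m.getName().equals(name) && m.getParameterCount() == 0) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Lists both get() methods; the order of getDeclaredMethods() is unspecified.
        for (Method m : Name.class.getDeclaredMethods()) {
            System.out.println(m.getReturnType().getSimpleName() + " " + m.getName()
                    + "() bridge=" + m.isBridge());
        }
        System.out.println("no-arg get() methods: " + countNoArgMethods(Name.class, "get"));
    }
}
```

A decompiler that printed both methods verbatim would produce source that javac rejects, even though the class file is perfectly valid.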
Indeed. The lower-level language must be more expressive by definition. It is more difficult to write but allows fine-grained control. This is the reason some optimization and obfuscation tricks are done in assembly (native world, not Java). And hence a disassembler simply cannot translate it back.
There are two valid ways for a disassembler to mitigate this: a) decompile to a language in which the bytecode can be expressed (in a concise/expressive manner; Java would always be a "possible" target because of Turing completeness), or b) accommodate the fact that there could be signature collisions in Java, e.g. by prefixing/suffixing the method name.
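A rough sketch of what option b) could look like, assuming methods arrive as JVM-style strings such as `a()I` (name followed by a `(params)return` descriptor). `CollisionRenamer` and `rename` are hypothetical names, not part of any real decompiler, and the sketch ignores the case where a generated name like `a_1` already exists in the class.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CollisionRenamer {
    // Input entries look like "a()I": method name, then "(parameter types)return type".
    // Returns one Java source name per input method, in the same order.
    static List<String> rename(List<String> methods) {
        Map<String, Integer> seen = new HashMap<>();
        List<String> sourceNames = new ArrayList<>();
        for (String method : methods) {
            int open = method.indexOf('(');
            int close = method.indexOf(')');
            String name = method.substring(0, open);
            // Java source tells overloads apart by name + parameter types only,
            // so the return type is deliberately left out of the key.
            String sourceKey = method.substring(0, close + 1);
            int occurrence = seen.merge(sourceKey, 1, Integer::sum);
            // First occurrence keeps its name; later ones get a numeric suffix.
            sourceNames.add(occurrence == 1 ? name : name + "_" + (occurrence - 1));
        }
        return sourceNames;
    }

    public static void main(String[] args) {
        List<String> input = new ArrayList<>();
        input.add("a()I");
        input.add("a()V");   // collides with a()I in Java source
        input.add("a(I)V");  // ordinary overload, no collision
        System.out.println(rename(input));
    }
}
```

Call sites would have to be rewritten to match, which is where the behavioral concerns raised in the replies come in.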
If you change the method name you end up with code that acts differently. Just imagine something like this pseudocode:

if (!new Exception().getStackTrace().getSha1sum().startsWith("0000")) alert("hello decompiler")
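
A runnable approximation of that idea, simplified to compare the method name directly instead of hashing the stack trace. `StackTraceCheck`, `check` and `check_1` are made-up names; `check_1` stands in for what a decompiler might emit after renaming.

```java
class StackTraceCheck {
    // Name of the method that called callerName(), read from a fresh stack trace.
    static String callerName() {
        StackTraceElement[] trace = new Exception().getStackTrace();
        // trace[0] is callerName itself; trace[1] is its caller.
        return trace[1].getMethodName();
    }

    // Takes the "normal" path only while it is still called "check".
    static boolean check() {
        return callerName().equals("check");
    }

    // Identical body, but carrying the kind of suffix a decompiler might add.
    static boolean check_1() {
        return callerName().equals("check");
    }

    public static void main(String[] args) {
        System.out.println("check():   " + check());
        System.out.println("check_1(): " + check_1());
    }
}
```

The two methods have the same body and still return different results, so a rename is not behavior-preserving for code that inspects its own stack.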

Your comment about Java and Turing completeness doesn't make sense unless you want the decompiler to basically output a Java implementation of a JVM?

Dare [0] emulates the Dalvik VM's runtime behavior to generate verifiable (for the vast majority of cases) Java bytecode from dex bytecode.

[0]: http://siis.cse.psu.edu/dare/