	by matharmin 3376 days ago
	This is about the Dalvik bytecode format, but the same applies to standard Java bytecode files. Practically any obfuscated Java code will have this, which makes reverse engineering much more difficult without tools that can handle it.

2 comments

HighlandSpring 3376 days ago

Why is it not as simple as mapping the method call instructions to returntype_methodname format? Am I missing something?

Flowdalic 3376 days ago

What if the call site does not use the returned value?

masklinn 3376 days ago

The bytecode encodes that information, the method identifier includes name, parameter types and return type.
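To illustrate the point above, here is a minimal sketch that prints the JVM-style descriptor strings for a few signatures. The class name `DescriptorDemo` and the helper `descriptor` are made up for illustration; only `java.lang.invoke.MethodType` is a real JDK API. Dalvik uses a related but differently laid out encoding, so treat this as the standard-Java side of the picture only.

```java
import java.lang.invoke.MethodType;

// Hypothetical demo class, not taken from any tool mentioned in the thread.
class DescriptorDemo {
    // Builds the JVM method descriptor (parameter types plus return type)
    // for the given signature.
    static String descriptor(Class<?> returnType, Class<?>... params) {
        return MethodType.methodType(returnType, params).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // Same parameter, different return types: the descriptors differ.
        System.out.println(descriptor(void.class, String.class)); // (Ljava/lang/String;)V
        System.out.println(descriptor(int.class, String.class));  // (Ljava/lang/String;)I
        // Same return type, different parameter: the descriptors differ too.
        System.out.println(descriptor(void.class, Object.class)); // (Ljava/lang/Object;)V
    }
}
```

A call site in a class file refers to a method by its name together with such a descriptor, so the return type is available even when the caller discards the result.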

tokenizerrr 3376 days ago

It's still present in the bytecode (since the VM has to know which method to call).

barahilia 3376 days ago

Yes. Because of overloading there may be 2 functions with the same name and the same return type but different arguments. So you need to include them too.

tokenizerrr 3376 days ago

No. Java itself handles the differing arguments.

MichaelGG 3376 days ago

Well one obfuscation technique is to change the argument types. So for instance, you might have foo(string). But after obfuscation, you move the code of foo(string) into foo(object). So a decompiler ends up showing:

  String s = "123";
  foo(s);

Java will call foo(string) if this is re-compiled, which is wrong. The decompiler would need to show foo((Object)s).
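A small sketch of that resolution problem, assuming both overloads exist in the same class after obfuscation. The class `OverloadDemo` and the method bodies are hypothetical; they only report which overload the Java compiler picked.

```java
// Hypothetical demo class: each overload returns its own name so the
// compiler's choice is visible.
class OverloadDemo {
    static String foo(String s) { return "foo(String)"; }
    static String foo(Object o) { return "foo(Object)"; }

    // What a naive decompiler would emit: the static type of s is String,
    // so javac picks the most specific overload, foo(String).
    static String naive() {
        String s = "123";
        return foo(s);
    }

    // What is needed to reproduce a bytecode call to foo(Object):
    // the cast changes the static type used for overload resolution.
    static String withCast() {
        String s = "123";
        return foo((Object) s);
    }

    public static void main(String[] args) {
        System.out.println(naive());    // foo(String)
        System.out.println(withCast()); // foo(Object)
    }
}
```

Overload selection happens at compile time from the static argument types, which is why the recompiled output only matches the original bytecode when the cast is present.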

On the CLR you can go even further and erase a lot of the type info and just pass objects around. That is, you can just change all your method signatures to foo(object, object, object) no matter which classes (not structs) you pass. Or have even more fun and randomize the classes. So now foo(string) becomes foo(DBConnectionInfo).

tokenizerrr 3376 days ago

For that specific example that would actually be fine in Java. Everything is an Object, so even if you have foo(Object) you can just pass it a String and that is then fine both for the JVM and the Java compiler.

Since the JVM identifies methods by their name + argument types this would most likely break overloading unless done very carefully. I've decompiled quite a few Java applications and haven't seen anything like this either, but perhaps I've just had the luck to not yet run into it.

MichaelGG 3376 days ago

It wouldn't be fine. It'd call the wrong method. The bytecode invokes the foo(object) one, but a naive decompiling would invoke foo(string). A cast would be required to get it to resolve correctly. Or am I misunderstanding?

barahilia 3376 days ago

Very interesting. Technically this should work with Dalvik opcodes too, but I do not know whether the VM does any runtime checks. I have never seen such a technique used in Android.

Animux 3376 days ago

Do you have any examples of tools that can handle this kind of deobfuscation?
