by scottdw2 5213 days ago
The payload could have been modified (to obfuscate its origin / source language) using a product named CodeSurfer/x86.

http://www.grammatech.com/research/products/CodeSurferx86.ht...

If it has access to source code, it can instrument the build process and obtain disassembly of high enough quality to support rewriting. Using its Scheme API, you can modify the CFG of each procedure directly, serialize the rewritten parts out as nasm, and even relink with the object files you don't have source for.

It works with any build system, and supports gcc / as / ld and cl / link.

So it may not have actually been written using a custom pl.


Also interesting that the CodeSurfer product referenced above is "sponsored by several government agencies, including the US Air Force, the US Navy, the Office of the Secretary of Defense, and the Department of Homeland Security", according to its own website.

Looks like a cool product.

There are some similar products from SGV Sarc, such as Crystal REVS (http://www.sgvsarc.com/products.htm). Does anyone know how they compare against CodeSurfer and related products from GrammaTech?
As a part-time Schemer this does not surprise me... Schemers have a tendency to craft their own languages. It's only natural.
Relevant, semi-related story about methods an adware-author used to write and conceal his adware in Scheme:

http://philosecurity.org/2009/01/12/interview-with-an-adware...

But surely the compiler (assuming a compiler is used) would convert the new high level language into regular Scheme primitives - I think it's unlikely that the result wouldn't be identifiable.
No... the product allows you to write scripts in Scheme that manipulate its machine code IR database, then spit out the machine code as nasm assembly, assemble it, and run the appropriate linker in the same way that was used to produce the original exe. Scheme is used as a macro language. So you use Scheme to say: change the code at EA 0xdeadbeef from a mov to a jmp. You can reorder functions, insert and remove code, etc. It works because it has very high quality disassembly based on observing compiler and linker invocations and introspecting the artifacts involved.
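To make the workflow described above concrete, here is a minimal toy sketch in Python. It is not CodeSurfer's Scheme API, which is not shown in this thread; the `Instr`, `Procedure`, `patch`, and `emit_nasm` names are hypothetical, and the "IR database" is just a list of procedures. It only illustrates the idea: patch the instruction at a given EA, reorder functions, then serialize the result as nasm-style text.

```python
from dataclasses import dataclass, field


@dataclass
class Instr:
    ea: int            # effective address in the original binary (made up here)
    mnemonic: str
    operands: str = ""


@dataclass
class Procedure:
    name: str
    instrs: list = field(default_factory=list)


def patch(procs, ea, mnemonic, operands=""):
    """Replace the instruction at address `ea`; return True if it was found."""
    for proc in procs:
        for ins in proc.instrs:
            if ins.ea == ea:
                ins.mnemonic, ins.operands = mnemonic, operands
                return True
    return False


def emit_nasm(procs):
    """Serialize the toy IR as nasm-style text, one label per procedure."""
    lines = ["section .text"]
    for proc in procs:
        lines.append(f"global {proc.name}")
        lines.append(f"{proc.name}:")
        for ins in proc.instrs:
            lines.append(f"    {ins.mnemonic} {ins.operands}".rstrip())
    return "\n".join(lines)


# Hypothetical two-procedure "database" standing in for real disassembly.
procs = [
    Procedure("decode", [Instr(0xDEADBEEF, "mov", "eax, ebx"),
                         Instr(0xDEADBEF1, "ret")]),
    Procedure("helper", [Instr(0xDEADBF00, "xor", "eax, eax"),
                         Instr(0xDEADBF02, "ret")]),
]

patched = patch(procs, 0xDEADBEEF, "jmp", "helper")  # mov -> jmp at that EA
procs.reverse()                                      # reorder the functions
listing = emit_nasm(procs)
print(listing)
```

A real tool would then assemble the listing and rerun the original linker command; this sketch stops at the text output.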
Ahh, that makes more sense, I thought it meant simply creating a higher level language from Scheme rather than manipulating the last stage(s) of producing the binary.
I'm having this vision of a Jedi being required to construct their own light sabre.
... more of a Lambda Knight writing his own lisp ;)

http://en.wikipedia.org/wiki/Knights_of_the_Lambda_Calculus

GrammaTech is a Scheme shop and employs several Lispers :-)
What exactly is the benefit of obfuscating the source language? Your hypothesis that it's written in Scheme is reasonable, but a DSL by any other name is a basket of Lisp macros. It's not a new language, but at the same time, it's kind of a Domain-specific language.

At any rate, even if it was written in Scheme, I don't think the goal was to obfuscate that fact.

See my comment below. I don't think you quite understood what I meant. I'm not saying the code was written in scheme. I'm saying there is a product that allows you to write scheme macros to manipulate a database of machine code IR derived from disassembly and then turn the modified database back into an executable.

Hiding the source language makes identifying the origin of the malware difficult. There are obvious reasons to do that.

Hiding the source language makes identifying the origin of the malware difficult.

How so? Knowing that it was written using VC would hardly help identifying the origin.

That's not to say that you're wrong about the tool used, but I don't believe the goal was to cover their tracks. I think it was some kind of optimization. Viruses often face space constraints.

I am not knowledgeable enough to say much on this topic, but I was wondering if maybe such rewriting would also serve to make it easy to mutate code to change its signature?
That was sort of what I was getting at. The obfuscation of the source language may be correlated, but it's not the goal.
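The signature-mutation idea raised above can be illustrated with a small Python sketch. It assumes the simplest possible "signature", a SHA-256 hash over the whole code image, and uses a few placeholder byte strings as stand-ins for compiled functions; the function names are hypothetical. Real scanners use more than whole-file hashes, and real reordering also requires fixing up relocations, so treat this only as an illustration of why rewriting can change a signature without changing the set of functions.

```python
import hashlib

# Placeholder machine-code bodies standing in for compiled functions.
functions = {
    "init":   bytes.fromhex("5589e5"),   # push ebp; mov ebp, esp
    "decode": bytes.fromhex("31c0c3"),   # xor eax, eax; ret
    "send":   bytes.fromhex("90c3"),     # nop; ret
}


def image(order):
    """Concatenate function bodies in the given layout order."""
    return b"".join(functions[name] for name in order)


def signature(blob):
    """A naive signature: the SHA-256 digest of the whole image."""
    return hashlib.sha256(blob).hexdigest()


original = image(["init", "decode", "send"])
mutated = image(["send", "init", "decode"])   # same functions, new layout

print(signature(original))
print(signature(mutated))
```

Both images contain exactly the same bytes in a different order, yet the two digests differ, which is the effect the question is asking about.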