by aleclm 817 days ago
> should SLL decompile to x << y or x << (y % 32)?

I think this is a bit of a misguided question. The hardware usually has precisely defined semantics. QEMU's << behaves similarly to C (undefined behavior for rhs >= 32), so the lifter (still QEMU) has to account for this and emits code that preserves the original semantics.

tl;dr: the code we emit should do the right thing depending on what the original instruction did, without making assumptions on what happens in case of C undefined behaviors.
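
To make that concrete, here is a rough sketch in C of two ways emitted code could stay well-defined for any shift amount. The helper names `sll_masked` and `sll_zeroing` are made up for illustration and are not what our tool emits. The first mirrors an ISA that only looks at the low 5 bits of the shift amount (MIPS `SLLV` works this way), the second a hypothetical ISA where oversized shifts produce zero.

```c
#include <stdint.h>

/* Masking semantics: only the low 5 bits of the amount are used,
 * so the C shift never sees an amount of 32 or more. */
static inline uint32_t sll_masked(uint32_t value, uint32_t amount) {
    return value << (amount & 31u);
}

/* Zeroing semantics (hypothetical ISA): amounts of 32 or more
 * yield 0. The explicit check keeps the C shift in range. */
static inline uint32_t sll_zeroing(uint32_t value, uint32_t amount) {
    return amount < 32u ? value << amount : 0u;
}
```

Which of the two is correct depends entirely on the original instruction, which is why a bare `x << y` is not a safe default.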

> Ghidra's type system lacks function pointer types

Weird limitation, we support those.

> it doesn't seem to understand stack slot reuse

That's a tricky one. We're now redesigning certain parts of the pipeline to enable LLVM to promote stack accesses to SSA values, which basically solves the stack slot reuse problem. This is probably one of the most important features experienced reversers ask for.
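
As a hand-written illustration of the problem (not actual decompiler output), the sketch below shows one 4-byte slot holding two unrelated values in turn, and what the same function could look like once each live range becomes its own value. The `frame` array and both function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Before promotion: a single stack slot is written and read twice,
 * first for a, then for b. */
static int32_t with_reused_slot(int32_t a, int32_t b) {
    uint8_t frame[4];                 /* hypothetical stack frame */
    int32_t tmp;
    memcpy(frame, &a, sizeof a);      /* slot holds a */
    memcpy(&tmp, frame, sizeof tmp);
    int32_t doubled = tmp * 2;
    memcpy(frame, &b, sizeof b);      /* same slot now holds b */
    memcpy(&tmp, frame, sizeof tmp);
    return doubled + tmp;
}

/* After promotion: the slot is gone and each value has its own name. */
static int32_t with_ssa_values(int32_t a, int32_t b) {
    int32_t doubled = a * 2;
    return doubled + b;
}
```

Both functions compute the same result; the second is what a reverser would rather read.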

> that language need not be C--

Making up your own language is a temptation one should resist.

Anyway, we're rewriting our backend using an MLIR dialect (we call it clift) which targets C but should be good enough to emit something "similar to C but slightly different". It might make sense to have a different backend there. But a "standard C" backend has to be the first use case.

We thought about emitting C++, since it would make our lives simpler. But I think targeting non-C as the first and foremost backend would be a mistake.

Also, a Python backend would be cool.

> Analysis necessarily involves...

I would be interested in discussing what exactly you mean here. Why don't you join our Discord server?

> I'd absolutely love to be able to import C++ header files to fill in known structure types

We have a project for importing from header files. Basically we want to use a compiler to turn them into DWARF debug symbols and then import those. Not too hard.