by skissane 499 days ago
I think Ghidra could do better even without any LLM involved. Ghidra will define local variables like this:

SERVICE_TABLE_ENTRY* local_5c;

I wish it at least did something like:

SERVICE_TABLE_ENTRY* local_5c_pServiceTableEntry;

Oh yeah, there’s probably some plugin or Python script to do this. But I just dabble with Ghidra in my spare time.

It would be great if it tracked the origin of each variable/parameter name, and could show names in a different colour (or with some other visual distinction) based on that origin. That way you could easily distinguish “name manually assigned by analyst” (probably correct) from “name picked by some LLM” (much more tentative, could easily be a hallucination).

1 comments

In my view one of the most pressing shortcomings of Ghidra is that it can't understand the lifetimes of multiple variables with overlapping stack addresses: https://github.com/NationalSecurityAgency/ghidra/issues/975

Ghidra does have an extensive scripting API, and I've used LLMs to help me write scripts to do bulk changes like you've described. But you would have to think about how you would ensure the name suffix is synchronized as you retype variables during your analysis.
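
As a rough illustration of the synchronization point, here is a minimal sketch of the naming logic alone, in plain Python. It does not touch the Ghidra API; the helper names `type_suffix` and `annotated_name` are made up for this example, and it assumes auto-generated names always look like `local_<hex>`. The idea is to rebuild the name from its stem every time, so re-running it after a retype replaces the old suffix instead of stacking a second one on top.

```python
import re

def type_suffix(type_name):
    """Turn a C type such as 'SERVICE_TABLE_ENTRY*' into 'pServiceTableEntry'."""
    pointer_depth = type_name.count("*")
    base = type_name.replace("*", "").strip()
    # Split on underscores and whitespace, then CamelCase each piece.
    words = [w for w in re.split(r"[_\s]+", base) if w]
    camel = "".join(w[:1].upper() + w[1:].lower() for w in words)
    return "p" * pointer_depth + camel

def annotated_name(current_name, type_name):
    """Rebuild the name from its 'local_<hex>' stem (assumed naming scheme)."""
    match = re.match(r"(local_[0-9a-fA-F]+)", current_name)
    if match is None:
        # Not an auto-generated name: leave analyst-chosen names alone.
        return current_name
    return match.group(1) + "_" + type_suffix(type_name)
```

A real script would still need to look up each local's current type and apply the rename through the scripting API, and would need to be re-run (or hooked in somehow) whenever a variable is retyped.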

Yeah, I don't know why they don't use something like SSA (static single assignment): make every line of code which performs an assignment create a new local variable.

Although I suppose when decompiling to C, you need to translate it out of SSA form when you encounter loops or backwards control flow.
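
To make the SSA idea concrete, here is a toy sketch of the renaming step for straight-line code only. It says nothing about how Ghidra's decompiler works internally; `to_ssa` and the `(target, operands)` input format are invented for illustration. It deliberately ignores branches and loops, which is where real SSA needs phi functions to merge versions, and where the translation back to C gets awkward.

```python
def to_ssa(assignments):
    """Rename straight-line assignments so each target is written exactly once.

    `assignments` is a list of (target, operands) pairs. Operands that were
    never assigned (e.g. parameters) are left unversioned.
    """
    version = {}  # variable name -> latest version number
    result = []
    for target, operands in assignments:
        # Operands read whichever version is current at this point.
        renamed = [f"{op}_{version[op]}" if op in version else op
                   for op in operands]
        # Each assignment creates a fresh version of its target.
        version[target] = version.get(target, 0) + 1
        result.append((f"{target}_{version[target]}", renamed))
    return result
```

For example, a sequence like `x = a; x = x + b; y = x` comes out with targets `x_1`, `x_2`, `y_1`, so the two unrelated uses of `x` no longer share a name.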