| HN Mirror


	by analogpixel 52 days ago
	I couldn't tell: is a person doing this, or was this an LLM dissecting it?

2 comments

siraben 52 days ago

This was made collaboratively by me directing coding agents at the binary, using Ghidra MCP extensively, along with disassembly and dynamic analysis with an emulator. I don't have a writeup of the process, but it was definitely not fully automatable (I wish it were, though). I might prepare a blog post with transcripts, session history, and things I learned along the way.

Broad takeaways:

- Ghidra MCP is not a silver bullet. There are lots of opportunities for mis-decoding, especially on older instruction sets (e.g. conflating code + data), which requires user input to flag data layout/structs.

- Agents still need a lot of user direction; otherwise the RE output is just kind of a random walk. With Z80 it's decent at reading code, but I expect it performs much worse than when reading x86 or ARM, for instance. The TI-84+ has a bunch of hardware quirks as well.

- GPT 5.5 is better than Opus 4.8 at RE. Opus 4.8 loves plausible-sounding RE'd logic without any checking. The gold standard is actually dynamically executing the binary and comparing the logic against the prose.

- Maintaining consistency in style and prose is a PITA across the wiki. Hard to reconcile prose <-> code. Can be somewhat mitigated by agent loops.
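
The "dynamically execute and compare" check from the takeaways can be sketched roughly as below. This is only an illustration under stated assumptions: `emulator_stub` is a hypothetical stand-in for running the real routine in an emulator, `model_from_prose` is a hypothetical transcription of what a wiki page claims, and the 8-bit add is an invented example rather than anything taken from TI-OS.

```python
from itertools import product
from typing import Callable, Iterable, List, Tuple

State = Tuple[int, int]    # hypothetical inputs: registers (A, B)
Result = Tuple[int, bool]  # hypothetical outputs: (A afterwards, carry flag)


def find_mismatches(
    model: Callable[[State], Result],
    observed: Callable[[State], Result],
    inputs: Iterable[State],
) -> List[Tuple[State, Result, Result]]:
    """Return every input where the documented model and the observed run disagree."""
    mismatches = []
    for state in inputs:
        expected = model(state)
        actual = observed(state)
        if expected != actual:
            mismatches.append((state, expected, actual))
    return mismatches


def emulator_stub(state: State) -> Result:
    """Stand-in for an emulator run (assumption): 8-bit add with carry out."""
    a, b = state
    total = a + b
    return total & 0xFF, total > 0xFF


def model_from_prose(state: State) -> Result:
    """Plausible-sounding but unchecked description: forgets the 8-bit wrap."""
    a, b = state
    return a + b, False


if __name__ == "__main__":
    all_inputs = list(product(range(256), repeat=2))
    bad = find_mismatches(model_from_prose, emulator_stub, all_inputs)
    print(f"{len(bad)} of {len(all_inputs)} inputs disagree; first: {bad[0]}")
```

The point of the design is that the prose-derived model never gets to grade itself: the only thing that counts as ground truth is what the executed binary does on the same inputs.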

I was also in discussions with people in the TI calculator programming space, who helped provide guidance. We previously did not have a catalogue of every subsystem in TI-OS, let alone most subroutines in the OS.
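
To illustrate the code + data conflation mentioned in the takeaways, here is a toy linear-sweep decoder over a handful of Z80 opcodes. It is a sketch, not how Ghidra works internally: the opcode table is deliberately tiny, the byte image is made up, and `data_ranges` is a hypothetical way of representing the "this region is data" hints a user would supply.

```python
# Instruction lengths and names for a tiny subset of Z80 opcodes.
LENGTHS = {0x00: 1, 0x01: 3, 0x21: 3, 0x3E: 2, 0xC3: 3, 0xC9: 1}
NAMES = {
    0x00: "NOP",
    0x01: "LD BC,nn",
    0x21: "LD HL,nn",
    0x3E: "LD A,n",
    0xC3: "JP nn",
    0xC9: "RET",
}


def sweep(image, data_ranges=()):
    """Decode from address 0 upward; addresses inside data_ranges are emitted as DB."""
    out = []
    pc = 0
    while pc < len(image):
        if any(start <= pc < end for start, end in data_ranges):
            out.append((pc, "DB"))  # user-flagged data byte, never decoded
            pc += 1
            continue
        length = LENGTHS.get(image[pc])
        if length is None or pc + length > len(image):
            out.append((pc, "??"))  # unknown or truncated instruction
            pc += 1
            continue
        out.append((pc, NAMES[image[pc]]))
        pc += length
    return out


# Made-up image: JP 0005h, two data bytes, then LD A,1 and RET at 0005h.
IMAGE = bytes([0xC3, 0x05, 0x00, 0x21, 0x3E, 0x3E, 0x01, 0xC9])

if __name__ == "__main__":
    print("naive:  ", sweep(IMAGE))
    print("flagged:", sweep(IMAGE, data_ranges=[(3, 5)]))
```

In the naive pass the first data byte happens to look like a three-byte instruction, so it swallows the start of the real code after it and the decoder loses sync. Flagging bytes 3 and 4 as data is enough to recover the intended instructions.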

RgrTheShrubbr 52 days ago

Having just recently heard about Ghidra and started using it with Claude, I am absolutely blown away by how little resistance it has decompiling old Win95/98 binaries. It's turning into a bit of a hobby of mine to take old software, decompile it, and find hidden treasures like images or messages.

Chu4eeno 51 days ago

There's this unfortunate common misconception (that LLMs luckily don't tend to share) that reverse engineering is illegal or immoral, when it's a great source of learning and a necessity for things like interop/preservation, and even has explicit carve-outs in the copyright laws of many/sane countries.

I know my government has a good amount of reverse engineers on the payroll (mostly in the security services).

hedgehog 52 days ago

Do you have plans to generate a buildable version of the sources, and do you know the original implementation language (C?)?

siraben 52 days ago

It's highly likely that the original implementation language was assembly. The code is very idiomatic.

Regarding source build, I think reverse engineering it to the point where you can reconstruct the source is possibly legally problematic, so I don't plan to do this, but maybe for certain subsystems like MathPrint (equation display) which was especially fun to RE. I have a PR up for it and it will be live at

https://siraben.github.io/ti84p-re/mathprint

ndiddy 52 days ago

Typically, people who are concerned about legal issues with disassemblies distribute a script file that contains all the code/data annotations, comments, variable names, and labels; the user can then feed this file and a copy of the original binary into the disassembler to reproduce the disassembly. Here's a random example for a 6502 codebase: https://github.com/TakuikaNinja/FDS-disksys . IDA Pro has this functionality built in: you can export a .idc script file that will reproduce the .idb file if you load the original binary into a fresh instance of IDA Pro and then run the script. Maybe Ghidra has something similar; if not, I bet you can get your AI to write export/import scripts for Ghidra.
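
As a rough sketch of that idea, independent of any particular disassembler: the annotations are stored as a sidecar keyed by address, with a hash of the binary they were made against, and are only merged back when the user supplies a matching binary. The JSON layout and the function names `export_annotations` and `apply_annotations` are hypothetical; this is not the .idc format or a Ghidra API.

```python
import hashlib
import json
from typing import Dict, List


def export_annotations(binary: bytes, annotations: Dict[int, Dict[str, str]]) -> str:
    """Serialize labels/comments by address; no bytes from the binary are included."""
    return json.dumps(
        {
            "sha256": hashlib.sha256(binary).hexdigest(),
            "annotations": {
                f"{addr:04X}": note for addr, note in sorted(annotations.items())
            },
        },
        indent=2,
    )


def apply_annotations(binary: bytes, sidecar: str) -> List[str]:
    """Rebuild an annotated listing from the user's own copy of the binary."""
    doc = json.loads(sidecar)
    if hashlib.sha256(binary).hexdigest() != doc["sha256"]:
        raise ValueError("binary does not match the one the annotations were made for")
    lines = []
    for key, note in doc["annotations"].items():
        addr = int(key, 16)
        lines.append(
            f"{addr:04X}: {binary[addr]:02X}  {note['label']}  ; {note['comment']}"
        )
    return lines


if __name__ == "__main__":
    rom = bytes([0x3E, 0x01, 0xC9])  # made-up three-byte "binary"
    notes = {
        0: {"label": "start", "comment": "load A"},
        2: {"label": "done", "comment": "return"},
    }
    sidecar = export_annotations(rom, notes)
    print("\n".join(apply_annotations(rom, sidecar)))
```

The hash check is the part that matters for the workflow described above: the sidecar alone is useless without the original binary, and it refuses to apply to a different ROM revision where the addresses would no longer line up.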

jamesfinlayson 52 days ago

> It's highly likely that the original implementation language was assembly.

Agreed. I did a bit of development on a TI-84+ years ago, and since I was not a skilled programmer back then I only used TI-BASIC, but the fact that you could only write apps in assembly makes me think the operating system was the same. From memory, ticalc.org had a gcc fork, though I don't recall which calculators it targeted.

analogpixel 52 days ago

How much have you spent so far on this (for tokens)?

siraben 52 days ago

The plans are heavily subsidized by the AI companies so I didn't end up needing to do API usage or buy another subscription. I have ChatGPT Pro and Claude Code Max.

xkcd-sucks 52 days ago

> Confidence is flagged: .....

> The big picture

> The structural reverse-engineering is comprehensive (every subsystem mapped, both cross-page mechanisms resolved ...

> Confidence summary / open items

Probably an LLM wrote the docs.

> (the GhidraMCP plugin reconnects for interactive work)

Probably LLM+Ghidra for the actual RevEng. Ultimately, does it matter if the end product works, though?

markus_zhang 52 days ago

I think it’s fine as long as it works. Personally I prefer doing everything manually because that’s where the fun is, but everyone has their own fun.
