by yorwba 2383 days ago
I'm assuming the game is playable, i.e. the decompression code is included on the cartridge and you just don't know how it works. In that case you could emulate the game and use a language model to identify strings containing Japanese text (you'd need to know the encoding to do that) so they can be extracted for translation. That doesn't allow you to put the translations into the compressed code, but you might be able to instrument the emulator to inject translated strings on-the-fly.
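A minimal sketch of the string-identification step, assuming the game stores text as standard Shift-JIS. Many games instead use a custom table of tile indices, in which case this heuristic finds nothing. The name `find_sjis_runs` and the `min_chars` threshold are made up for illustration, and a plain byte-range check stands in for the language model suggested above.

```python
def _is_lead(b: int) -> bool:
    # First byte of a double-byte Shift-JIS character.
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xEF


def _is_trail(b: int) -> bool:
    # Second byte of a double-byte Shift-JIS character.
    return 0x40 <= b <= 0x7E or 0x80 <= b <= 0xFC


def find_sjis_runs(data: bytes, min_chars: int = 4) -> list[tuple[int, str]]:
    """Return (offset, text) for runs of at least min_chars double-byte characters."""
    runs = []
    i = 0
    n = len(data)
    while i < n - 1:
        start = i
        chars = 0
        # Consume consecutive lead/trail byte pairs.
        while i < n - 1 and _is_lead(data[i]) and _is_trail(data[i + 1]):
            i += 2
            chars += 1
        if chars >= min_chars:
            try:
                runs.append((start, data[start:i].decode("shift_jis")))
            except UnicodeDecodeError:
                pass  # byte pairs in range but not assigned characters
        if i == start:
            i += 1  # no character here, move on by one byte
    return runs


if __name__ == "__main__":
    dump = b"\x00\x01" + "こんにちは".encode("shift_jis") + b"\x00ABC"
    for offset, text in find_sjis_runs(dump):
        print(hex(offset), text)
```

A byte-range check like this produces false positives on binary data that happens to fall in the right ranges, which is where a language model scoring the decoded candidates would help.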

You might as well write a tool that extracts strings from a video signal using OCR, and translates them. That would make the solution more universal, and you could even use it to e.g. suppress ads.
Well, that's super hard, since Japanese text encoding is an epic story in digital archaeology in itself.
I'm not well-versed in the subject. How does encoding come into play for text displayed on the screen? Did they use a strange way of representing the Japanese text because of technical limitations?
The GBA didn't have much RAM. There is a good chance tiny chunks of the game get decompressed as needed, and there is never a time when the whole thing is decompressed at once and can be dumped.
You don't need everything to be decompressed all at once to dump it. You could continuously dump the memory contents while exploring the game (or having a fuzzer do it for you).
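A rough sketch of the accumulation idea, assuming you already have a series of RAM snapshots as byte strings (how you get them out of the emulator is not shown). Both `collect_strings` and `decode_nul_terminated` are hypothetical names, and the NUL-terminated Shift-JIS extractor is only a stand-in for whatever string format the game really uses.

```python
from typing import Callable, Iterable, Iterator


def decode_nul_terminated(snapshot: bytes, encoding: str = "shift_jis") -> Iterator[str]:
    """Stand-in extractor: yield NUL-separated chunks that decode cleanly."""
    for chunk in snapshot.split(b"\x00"):
        if len(chunk) < 4:
            continue  # too short to be interesting
        try:
            yield chunk.decode(encoding)
        except UnicodeDecodeError:
            continue  # not text in this encoding


def collect_strings(
    snapshots: Iterable[bytes],
    extract: Callable[[bytes], Iterable[str]] = decode_nul_terminated,
) -> list[str]:
    """Merge strings seen across many partial dumps, in first-seen order."""
    first_seen: dict[str, int] = {}
    for index, snapshot in enumerate(snapshots):
        for text in extract(snapshot):
            # Keep the index of the first snapshot where the string appeared.
            first_seen.setdefault(text, index)
    return list(first_seen)


if __name__ == "__main__":
    hello = "こんにちは".encode("shift_jis")
    bye = "さようなら".encode("shift_jis")
    dumps = [b"\x00" + hello + b"\x00", b"\x00" + bye + b"\x00" + hello + b"\x00"]
    print(collect_strings(dumps))
```

Each snapshot only needs to contain whatever happens to be decompressed at that moment. The union over many snapshots is what approximates the full script.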