by pedrocx486 2383 days ago
Is there a way to do the reverse?

There's a quite "legendary" Game Boy Advance game out there (Klonoa - Densetsu no Star Medal) that never got an English translation because Namco applied some sort of in-house compression so the game could fit on a GBA cartridge. AFAIK no one has ever been able to crack it open and release code to de/compress it.

A while ago I had a "bounty" of USD100 for anyone that could do it (just the decompression and re-compression, not translation) but there aren't many people that want to fiddle with low-level GBA coding.

6 comments

Anyone who does malware analysis professionally has to deal with packed data in new and funky ways, as malware binaries tend to be packed (i.e. compressed) in unique ways to get past antivirus software.

If the bounty were high enough, there are people out there who do this sort of thing professionally and would probably jump on the opportunity.

For something like an obscure GBA game, you can probably write up some blogposts to add reputation to the minuscule monetary value of the bounty.
Well, the bounty nowadays is $0. :-)

I have given up until I have the proper time, along with a friend, to try to crack the game's insides on our own. If anyone did this now, it'd be for the community.

Instead of trying to crack the compression algorithm directly, it would make sense to disassemble the machine code and try to understand what it does.
I'm assuming the game is playable, i.e. the decompression code is included on the cartridge and you just don't know how it works. In that case you could emulate the game and use a language model to identify strings containing Japanese text (you'd need to know the encoding to do that) so they can be extracted for translation. That doesn't allow you to put the translations into the compressed code, but you might be able to instrument the emulator to inject translated strings on-the-fly.
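To make the string-scanning step concrete, here is a minimal sketch that walks a raw memory dump and pulls out runs of double-byte Shift-JIS characters. It assumes the game stores its text as Shift-JIS, which is only a guess: many GBA games use a custom table of tile indices instead, in which case the decode step would need that table. The function name and the `min_chars` threshold are made up for illustration.

```python
def is_lead(b: int) -> bool:
    # Lead-byte ranges for double-byte Shift-JIS characters.
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xEF


def is_trail(b: int) -> bool:
    # Trail-byte ranges for double-byte Shift-JIS characters.
    return 0x40 <= b <= 0x7E or 0x80 <= b <= 0xFC


def find_sjis_strings(data: bytes, min_chars: int = 4):
    """Return (offset, text) pairs for runs of double-byte Shift-JIS text.

    Assumption: the dump contains Shift-JIS text. Short runs are dropped
    because random binary data often looks like one or two valid characters.
    """
    results = []
    i = 0
    n = len(data)
    while i < n - 1:
        start = i
        chars = []
        while i < n - 1 and is_lead(data[i]) and is_trail(data[i + 1]):
            try:
                chars.append(data[i:i + 2].decode("shift_jis"))
            except UnicodeDecodeError:
                break  # byte pair is in range but not an assigned character
            i += 2
        if len(chars) >= min_chars:
            results.append((start, "".join(chars)))
        # Always advance by at least one byte so the scan terminates.
        i = max(i, start + 1)
    return results
```

Running this over a dump gives candidate offsets to inspect by hand. It will still produce false positives on binary data, so treat the output as leads, not as a finished script dump.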
You might as well write a tool that extracts strings from a video signal using OCR, and translates them. That would make the solution more universal, and you could even use it to e.g. suppress ads.
Well that's super hard since Japanese encoding is an epic story in digital archaeology itself.
I'm not well-versed in the subject. How does encoding come into play for text displayed on the screen? Did they use a strange way of representing the Japanese text because of technical limitations?
The GBA didn't have much RAM. There is a good chance tiny chunks of the game get decompressed as needed, and there is never a time when the whole thing is decompressed at once and can be dumped.
You don't need everything to be decompressed all at once to dump it. You could continuously dump the memory contents while exploring the game (or having a fuzzer do it for you).
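A rough sketch of the continuous-dump idea, assuming your emulator can hand you periodic snapshots of a RAM region as `bytes` (how you get them is emulator-specific and not shown; the function name and chunk size here are hypothetical). It splits each snapshot into fixed-size chunks and keeps every distinct non-empty chunk along with where it was first seen, so decompressed data that only lives in RAM briefly still gets captured.

```python
import hashlib


def collect_unique_chunks(snapshots, chunk_size=256):
    """Return (frame, offset, chunk) for each distinct chunk, in first-seen order.

    `snapshots` is any iterable of bytes objects, one per captured frame.
    All-zero chunks are skipped since they carry no information.
    """
    seen = {}
    for frame, snap in enumerate(snapshots):
        for off in range(0, len(snap), chunk_size):
            chunk = snap[off:off + chunk_size]
            if not any(chunk):
                continue  # untouched or cleared memory
            digest = hashlib.sha1(chunk).hexdigest()
            # Keep only the first sighting of each distinct chunk.
            seen.setdefault(digest, (frame, off, chunk))
    return list(seen.values())
```

Fixed-size chunking is the simplest choice but will miss data that straddles chunk boundaries differently between frames. Watching writes from the decompression routine in the emulator's debugger would be more precise, if you can find that routine.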
It might end up smarter to RE the decompression and then patch the game to accept an uncompressed translation (and bump up the size of the ROM from 16mb to 32mb).
You should be able to do it with enough sample data, but it might not be perfect since you'd still be guessing at what it's really doing.
Doesn't it depend on whether the algorithm is lossy? If it's lossy (not injective), it's impossible to invert the function.
This is discussing text or code compression. There's no point in using lossy methods for the dialogue of your game.
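To illustrate the distinction, here is a small sketch. It uses `zlib` purely as a stand-in for a lossless scheme (it is not Namco's algorithm) and a toy "lossy" function that throws away the low bits of each byte. A lossless codec round-trips every input, while the lossy one maps two different inputs to the same output, so no decompressor could tell them apart.

```python
import zlib


def is_lossless(compress, decompress, samples):
    """True if decompress(compress(s)) == s for every sample."""
    return all(decompress(compress(s)) == s for s in samples)


def drop_low_bits(data: bytes) -> bytes:
    """Toy lossy transform: clears the low 4 bits of every byte."""
    return bytes(b & 0xF0 for b in data)


samples = [b"", b"hello" * 10, bytes(range(256))]

# zlib round-trips all samples, so it is invertible on them.
print(is_lossless(zlib.compress, zlib.decompress, samples))

# Two distinct inputs collapse to the same output: information is gone.
print(drop_low_bits(b"\x41") == drop_low_bits(b"\x42"))
```

Passing a round-trip check on samples only shows that the codec is lossless on those samples, which is the same caveat as guessing an algorithm from sample data.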