by satoshinm 1846 days ago
These undocumented 6502 opcodes were very useful during the 90s, when developing Game Genie codes for the Nintendo Entertainment System.

GG codes were essentially[1] single-byte ROM patches. For example, the code AAAAAA would patch address 0x8000 to 0x00. (I have a web-based decoder and Node.js module for those interested [2].)
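To make the "single-byte patch" idea concrete, here is a minimal Python sketch of a decoder, assuming the widely documented NES Game Genie letter alphabet and bit layout. The function name `decode_gg` and the tuple it returns are my own choices for illustration, not the API of the decoder linked in [2].

```python
# Letter-to-nibble alphabet used by NES Game Genie codes (A=0x0 ... N=0xF).
GG_LETTERS = "APZLGITYEOXUKSVN"


def decode_gg(code):
    """Decode a 6- or 8-letter NES Game Genie code (hypothetical helper).

    Returns (cpu_address, data, compare); compare is None for 6-letter codes.
    """
    n = [GG_LETTERS.index(ch) for ch in code.upper()]
    if len(n) not in (6, 8):
        raise ValueError("NES Game Genie codes have 6 or 8 letters")

    # The 15 address bits are scattered across the letters at indices 1..5;
    # the top bit is always set, so patches land in 0x8000-0xFFFF.
    address = 0x8000 | (
        ((n[3] & 7) << 12)
        | ((n[5] & 7) << 8) | ((n[4] & 8) << 8)
        | ((n[2] & 7) << 4) | ((n[1] & 8) << 4)
        | (n[4] & 7) | (n[3] & 8)
    )

    # The data byte borrows one bit from the last letter of the code.
    last = n[-1]
    data = ((n[1] & 7) << 4) | ((n[0] & 8) << 4) | (n[0] & 7) | (last & 8)

    compare = None
    if len(n) == 8:
        # 8-letter codes carry an extra "only patch if the ROM holds this" byte.
        compare = ((n[7] & 7) << 4) | ((n[6] & 8) << 4) | (n[6] & 7) | (n[5] & 8)

    return address, data, compare
```

With this sketch, `decode_gg("AAAAAA")` gives address 0x8000 and data 0x00 with no compare byte.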

The Galoob Game Genie device only let you enter 3 codes at a time (unless you chained together multiple Game Genies), so minimizing the number of bytes you have to change to make the code work was a worthwhile goal. If your effect requires 3 codes, a typical player will not be able to use any other code. Modifying only one byte is ideal.

How do unofficial opcodes help here? Well, the official opcode list is quite limited. Undocumented/illegal opcodes help in at least two ways:

Multi-byte no-operations: want to remove an instruction completely? You can NOP the first byte (the opcode), but this leaves the operand bytes, if there are any, to be executed as instructions, desynchronizing the instruction stream (sometimes you can get away with this, and it produces interesting effects). Fine, so you can add one or two more codes NOP'ing out the operands, but then your effect is 2 or 3 codes long!

Undocumented codes to the rescue. DOP and TOP, double-nop and triple-nop, turn this 2 or 3 code effect into only 1 code. Although to be fair, there is another workaround: patch to a legitimate opcode which has an operand of the same size (same or similar "addressing mode"), but an effect which does not meaningfully impact the program operation. Frequently this was possible, with some effort, but DOP and TOP are nonetheless a cleaner solution.
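The desynchronization problem is easy to see with a toy length table. The sketch below is only an illustration: the six-byte routine is made up, and `LENGTHS` covers just the opcodes that appear in it (0x04 and 0x0C are the unofficial two- and three-byte NOPs, i.e. DOP zero page and TOP absolute).

```python
# Instruction lengths in bytes for the handful of opcodes used below.
LENGTHS = {
    0xEA: 1,  # NOP (official)
    0x04: 2,  # DOP zp  (unofficial two-byte NOP)
    0x0C: 3,  # TOP abs (unofficial three-byte NOP)
    0xCE: 3,  # DEC abs
    0xA9: 2,  # LDA #imm
    0x20: 3,  # JSR abs
    0x05: 2,  # ORA zp
    0x60: 1,  # RTS
}


def instruction_starts(code):
    """Return the offsets at which the CPU would fetch an opcode."""
    offsets, pc = [], 0
    while pc < len(code):
        offsets.append(pc)
        pc += LENGTHS[code[pc]]
    return offsets


def patch(code, offset, value):
    """Return a copy of code with one byte replaced, like a single GG code."""
    patched = bytearray(code)
    patched[offset] = value
    return bytes(patched)


# Made-up routine: DEC $0320 / LDA #$05 / RTS
original = bytes([0xCE, 0x20, 0x03, 0xA9, 0x05, 0x60])

nop_patched = patch(original, 0, 0xEA)  # one-byte NOP over the DEC opcode
top_patched = patch(original, 0, 0x0C)  # three-byte TOP over the DEC opcode
```

After the plain NOP patch the old operand bytes 20 03 A9 are fetched as a JSR, so nothing after it lines up with the original code. The TOP patch keeps the same instruction boundaries as the original.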

Another benefit is combining multiple operations into one, also for a more compact code. Many of the undefined opcodes exist, as I understand it, due to "don't cares" in minimizing the instruction decoding logic, hence they combine multiple operations, sometimes usefully so (sometimes not):

ASR = AND + LSR

ANC = AND + ASL

XAA = TXA + AND

ARR = AND + ROR

DCP = DEC + CMP

ISC = INC + SBC

LAS = LDA + TSX

LAX = LDA + LDX

LXA = AND + TAX

RLA = ROL + AND

RRA = ROR + ADC

SAX = AND + AND

SBX = CMP + DEX

SHA = AND + AND

SHX = AND + AND

SHY = AND + AND

SLO = ASL + ORA

SRE = LSR + EOR

TAS = AND + AND

etc

The multiple-store (Sxx) and load (Lxx) operations were the most useful for this purpose, from my recollection. I think the one I used most was LAX, which loads the operand from memory into both registers A and X in one opcode. The combined store and arithmetic opcodes could in theory be useful, though in practice many were unstable.
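As a rough model of what "combining" means, the sketch below builds two of the stable combined opcodes, LAX and DCP, out of the official instructions they are named after. The `Cpu` class is a hypothetical toy (no cycle counts, no addressing modes, operands are plain addresses), not an emulator.

```python
from dataclasses import dataclass, field


@dataclass
class Cpu:
    a: int = 0
    x: int = 0
    carry: bool = False
    zero: bool = False
    negative: bool = False
    mem: dict = field(default_factory=dict)

    def _set_nz(self, value):
        self.zero = value == 0
        self.negative = bool(value & 0x80)

    # Official instructions, reduced to their register and flag effects.
    def lda(self, addr):
        self.a = self.mem.get(addr, 0)
        self._set_nz(self.a)

    def ldx(self, addr):
        self.x = self.mem.get(addr, 0)
        self._set_nz(self.x)

    def dec(self, addr):
        value = (self.mem.get(addr, 0) - 1) & 0xFF
        self.mem[addr] = value
        self._set_nz(value)

    def cmp(self, addr):
        diff = self.a - self.mem.get(addr, 0)
        self.carry = diff >= 0  # carry set means "no borrow"
        self._set_nz(diff & 0xFF)

    # Unofficial opcodes modelled as two official ones back to back.
    def lax(self, addr):
        self.lda(addr)
        self.ldx(addr)

    def dcp(self, addr):
        self.dec(addr)
        self.cmp(addr)
```

For example, with `cpu = Cpu(mem={0x10: 0x05})`, calling `cpu.lax(0x10)` leaves 0x05 in both A and X, which would otherwise take two instructions.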

[1] Technically, GG codes patch the lower 15 bits of _CPU_ addresses, which can map to multiple ROM addresses depending on the mapper hardware in the cartridge. This effect is difficult (though not impossible) to harness for patching several ROM addresses at once, because the bytes you want to change have to line up exactly. In practice, on these games the coder will use an 8-letter code, which adds a compare value, to restrict the patching to one ROM bank. Example: the famous SLXPLOVS = 1123?de:bd, meaning "if CPU address 0x9123 sends data 0xde, replace with 0xbd". This replaces the DEC opcode with LDA, which disables decrementing the player's life count variable when they die, resulting in the "Infinite Lives" effect on Super Mario Bros. 3. Galoob codemasters didn't use undocumented opcodes in their official GG codes as far as I know, but they were useful in homemade amateur codes.

[2] https://satoshinm.github.io/nescode/
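The compare behaviour described in footnote [1] can be sketched as a read hook. This is a simplified model under my own assumptions: `patched_read` is a hypothetical helper, and the two bank bytes are invented to stand for different ROM banks that happen to be visible at the same CPU address.

```python
def patched_read(rom_byte, cpu_addr, patch_addr, data, compare=None):
    """Return the byte the CPU sees for one read from cartridge space."""
    in_cart_space = cpu_addr >= 0x8000
    same_low_bits = (cpu_addr & 0x7FFF) == (patch_addr & 0x7FFF)
    if in_cart_space and same_low_bits and (compare is None or rom_byte == compare):
        return data
    return rom_byte


# Invented bytes that two different ROM banks might hold at CPU address 0x9123.
bank_bytes = {"bank_a": 0xDE, "bank_b": 0x4C}

# 6-letter style patch: every bank mapped at that address gets the new byte.
without_compare = {
    name: patched_read(byte, 0x9123, 0x9123, 0xBD)
    for name, byte in bank_bytes.items()
}

# 8-letter style patch: only the bank that really holds 0xDE is changed.
with_compare = {
    name: patched_read(byte, 0x9123, 0x9123, 0xBD, compare=0xDE)
    for name, byte in bank_bytes.items()
}
```

Without the compare value both banks read back 0xBD, which would corrupt whatever bank_b had there. With it, bank_b keeps its original 0x4C.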