by dboreham 24 days ago
The microcode is in a ROM. It's a regular structure where a 1 looks different to a 0.

Yes, literally this. No verilog decode, just looking for signals in the image of a 1 vs. a 0. For example, a 1 may be the existence of a transistor at a particular intersection of wiring.
Right. And the best way to think about microcode is as code for a wacky, custom VLIW processor that implements the programmer-level x86 (in this case) instruction set. Various fields in the microcode send signals to different parts of the processor to activate them, routing values along internal busses and between registers, functional units and memory to cause the processor to execute the x86 instructions.
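To make the "various fields send signals to different parts of the processor" idea concrete, here is a minimal sketch of slicing one wide micro-word into named fields. The field names and widths are invented for illustration and are not the real 386 (or 8086) layout.

```python
# Hypothetical field layout, most significant field first.
# Names and widths are assumptions for illustration, not the real format.
FIELDS = [
    ("src", 5),      # which register drives the internal bus
    ("dst", 5),      # which register latches the bus value
    ("alu_op", 4),   # operation selected in the ALU
    ("mem_op", 2),   # memory/bus-unit request
    ("next", 5),     # sequencing control
]  # 21 bits in total


def decode(word, fields=FIELDS):
    """Split an integer micro-word into a dict of named field values."""
    shift = sum(width for _, width in fields)
    signals = {}
    for name, width in fields:
        shift -= width
        signals[name] = (word >> shift) & ((1 << width) - 1)
    return signals


word = 0b00011_00101_0110_01_10000
signals = decode(word)
print(signals)
```

Every field is active in the same cycle, which is what makes the VLIW comparison apt: one word steers several units at once rather than encoding a single operation.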
So what you actually need is a program that navigates through the huge image of the die and detects whether the structure it is looking at is a 1 or a 0? At a fundamental level, is this a cross between machine learning and image processing?
I helped out on this image-to-bits transcription, doing manual verification of the automated work. I did the whole thing by hand: I sliced the ROM images into strips that excluded parts of the image that don't encode bits, used my tablet and stylus to manually place a black dot on every 1 bit, then wrote a trivial program that detected the presence or absence of the black dot in each cell. From my perspective, the ROM is organized like a series of "ladders" where the 1 bits are missing legs of the ladder, and I was placing dots on the missing legs. I compared my results with the ML output and manually re-checked each bit where we disagreed.

http://brianluft.com/images/2026/05/386_microcode_bits.jpg -- my fully annotated result. I was working from a higher-quality PNG; this is highly compressed because it's a big image.
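For anyone curious what a "trivial program that detects the black dot in each cell" might look like, here is a rough sketch. It is not the program used for the project: it assumes the strip is already cropped and aligned to an evenly spaced grid, and it takes the image as plain nested lists of grayscale values so it runs without any imaging library. The threshold values are arbitrary assumptions.

```python
def read_bits(image, rows, cols, threshold=128, min_dark=0.2):
    """Return a rows x cols grid of bits from a grayscale strip.

    image is a list of pixel rows (0 = black, 255 = white). A cell reads
    as 1 when at least min_dark of its pixels are darker than threshold.
    """
    cell_h = len(image) // rows
    cell_w = len(image[0]) // cols
    bits = []
    for r in range(rows):
        row_bits = []
        for c in range(cols):
            dark = sum(
                1
                for y in range(r * cell_h, (r + 1) * cell_h)
                for x in range(c * cell_w, (c + 1) * cell_w)
                if image[y][x] < threshold
            )
            row_bits.append(1 if dark >= min_dark * cell_h * cell_w else 0)
        bits.append(row_bits)
    return bits


def make_strip(dots, rows, cols, cell=4):
    """Build a synthetic white strip with a 2x2 black dot in each listed cell."""
    image = [[255] * (cols * cell) for _ in range(rows * cell)]
    for r, c in dots:
        for y in range(r * cell + 1, r * cell + 3):
            for x in range(c * cell + 1, c * cell + 3):
                image[y][x] = 0
    return image


strip = make_strip({(0, 1), (1, 2)}, rows=2, cols=3)
bits = read_bits(strip, rows=2, cols=3)
print(bits)
```

Slicing the image into strips that contain only bit cells, as described above, is what lets the grid arithmetic stay this simple.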

Thank you so much for your work. Thank you!

I wanted to give HN a perspective on working on this stuff: Working on these micrographs is like looking for a penny on 4 football fields: I tried to see how long it would take me to search the physical area for any coins, and it took 4 1/2 hours and I did not find a penny, but I found two dimes.

This is maddening work, and again, thanks.

Yes, exactly. Historically you would write some simple image-processing software that aligns the grid and then looks for properties at each specific bit position. Die shots are usually highly imperfect (the delayering tends to leave some artifacts or damage), so merging multiple scans is frequently important as well. Travis Goodspeed has a neat tool for this workflow at https://github.com/travisgoodspeed/maskromtool and the blog mentions John McMaster’s bitract: https://github.com/SiliconAnalysis/bitract although I think most people working on these projects just write a one-off tool, as the Discord users mentioned in the blog post eventually did.
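As a sketch of the "merge multiple scans" step (not how maskromtool or bitract do it, just the basic idea), a per-bit majority vote with a list of disputed positions for manual re-checking could look like this. It assumes each scan has already been reduced to an aligned grid of 0/1 values.

```python
def merge_scans(scans):
    """Majority-vote several aligned bit grids.

    Returns (merged, disputed), where disputed lists the (row, col)
    positions on which the scans did not all agree.
    """
    rows, cols = len(scans[0]), len(scans[0][0])
    merged, disputed = [], []
    for r in range(rows):
        merged_row = []
        for c in range(cols):
            ones = sum(scan[r][c] for scan in scans)
            merged_row.append(1 if ones * 2 > len(scans) else 0)
            if 0 < ones < len(scans):
                disputed.append((r, c))
        merged.append(merged_row)
    return merged, disputed


scans = [
    [[1, 0, 1], [0, 1, 0]],
    [[1, 0, 0], [0, 1, 0]],  # damaged cell at (0, 2)
    [[1, 0, 1], [0, 0, 0]],  # damaged cell at (1, 1)
]
merged, disputed = merge_scans(scans)
print(merged, disputed)
```

Using an odd number of scans avoids ties; with an even number this sketch resolves a tie to 0, which is an arbitrary choice.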

More modern devices are of course more difficult due to layers, feature size, and less visually obvious ROM bit designs.

Anyway, the impressive part of this project was really understanding the undocumented microcode assembly language through inference and trace following; the 1s and 0s look like they were the easy part!

The full workflow seems to look something like this, with the added complications relative to the 8086 microcode being that the 80386 microcode acts as an orchestration layer on top of hardwired engines, programmable logic arrays, and fault/protection redirection. The 8086 microcode does all that algorithmically, reusing the same hardware instead of having dedicated transistors.

1. Extract the ROM bits.
2. Determine physical-to-logical bit ordering.
3. Identify microinstruction boundaries.
4. Infer field boundaries.
5. Associate fields with hardware destinations (check with die tracing).
6. Decode instruction-dispatch programmable logic arrays.
7. Associate x86 instructions with microcode entry points.
8. Infer repeated idioms: moves, ALU ops, termination, calls, tests, redirects.
9. Decode accelerator protocols.
10. Validate against known architectural behavior.
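Steps 2 and 3 can be sketched roughly as follows. The 94,720-bit ROM size and 37-bit word width are the figures quoted elsewhere in this thread; the column-interleaved physical layout is purely an assumption for illustration, since the real ordering has to be worked out from the die.

```python
WORD_BITS = 37      # microinstruction width quoted in this thread
ROM_BITS = 94_720   # ROM size quoted in this thread

# Sanity check: the ROM should hold a whole number of microinstructions.
assert ROM_BITS % WORD_BITS == 0
NUM_WORDS = ROM_BITS // WORD_BITS


def deinterleave(row, words_per_row):
    """Undo a hypothetical layout where bit i of word w sits at
    column i * words_per_row + w of a physical ROM row."""
    return [row[w::words_per_row] for w in range(words_per_row)]


def bits_to_int(bits):
    """Pack a list of bits (most significant first) into an integer."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


# Toy example: two 4-bit words interleaved in one physical row.
physical_row = [1, 0, 0, 1, 1, 1, 1, 0]
words = [bits_to_int(w) for w in deinterleave(physical_row, words_per_row=2)]
print(NUM_WORDS, words)
```

A wrong guess at the ordering usually shows up quickly, because fields that should look regular (such as sequencing bits) come out as noise.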

Keep in mind that this was Intel's flagship processor from October 1985 until April 1989, and they had tried to eliminate all the second sourcing. It wasn't until 1991 that the Am386 was released, and out came all the lawyers.

They were using the 6th and 7th bytes of the GDT/LDT descriptors, which were reserved; since this affected protected-mode and virtual-mode addressing, the handling was likely stored in the microcode. That affected Xenix and pissed off Intel enough that they fixed their own version of Xenix and no one else's. SCO did a rewrite and charged $500 for the privilege of running a multi-user OS.

Add to #8 the new addressing modes and the new protected modes, which affected ALU ops, moves, calls, redirects, and indirects.

On #7, the microcode entry points are linked directly to the instruction decode logic. That covers, of course, the great LOADALL instruction, but is not limited to it: there is also the new multi-stage instruction pipeline and prefetch.
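A rough way to picture #6 and #7: a dispatch PLA behaves like a table of bit patterns with don't-care positions, each mapping to a microcode entry address. The sketch below uses real one-byte opcode groups, but the entry addresses are made up and the real 386 decode looks at far more than one opcode byte.

```python
# Each row is (mask, value, entry): an opcode matches when
# (opcode & mask) == value. Entry addresses are invented for illustration.
PLA_ROWS = [
    (0b11111100, 0b10001000, 0x040),  # 100010xx: MOV between r/m and reg
    (0b11111000, 0b01010000, 0x120),  # 01010xxx: PUSH reg
    (0b11111000, 0b01011000, 0x128),  # 01011xxx: POP reg
]


def entry_point(opcode, rows=PLA_ROWS):
    """Return the entry address of the first matching row, or None."""
    for mask, value, entry in rows:
        if (opcode & mask) == value:
            return entry
    return None


for opcode in (0x89, 0x53, 0x5B, 0x90):
    print(hex(opcode), entry_point(opcode))
```

Recovering the real table means reading the PLA off the die the same way as the ROM, then checking that each entry address lands on microcode that plausibly implements the instruction.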

It took AMD years to black-box the 386, and then:

"1987–1992: The arbitration proceeding, originally expected to take only six weeks, dragged on for nearly five years."

https://en.wikipedia.org/wiki/List_of_discontinued_x86_instr....

"The 80386 microcode was successfully extracted and publicly disassembled by a team of hardware historians and demoscene researchers (including reenigne, Ken Shirriff, and others). They extracted the 94,720-bit microcode ROM from 80386 die shots by combining image processing, neural networks, and human-aided automation. AI tools played a crucial role in cleaning up the die images, detecting cell patterns, and binarizing the data before humans parsed the 37-bit microinstruction formats. You can read about the full process on the Reenigne Blog. By contrast, the 8086 microcode was extracted through purely human-driven analysis of die photos. The 8086's 21-bit microcode is simpler and was fully reverse-engineered and disassembled in 2020. You can explore the decoded 8086 microinstructions interactively using the nand2mario 8086 Microcode Browser."