

	by ajenner 2118 days ago
	Author here if anyone has any questions.

7 comments

userbinator 2118 days ago

Does the microcode give any hints on why the general PUSH and POP are in completely different places in the opcode map (push is FF/6, pop is in its own group in 8F/0 with 8F/1-7 invalid, while FF/7 is unused)? It almost looks like FF/7 was supposed to be the pop. I've always wondered what 8F/1-7 and FF/7 do on an 8086/8 too, but it's very hard to find that information.
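For readers unfamiliar with the "FF/6" notation used above: the digit after the slash is the 3-bit reg field of the ModRM byte that follows the opcode byte, which these group opcodes use as an opcode extension. A minimal sketch of pulling that field out (the function name `modrm_fields` is just an illustrative choice, not something from the article):

```python
def modrm_fields(modrm: int) -> tuple[int, int, int]:
    """Split an x86 ModRM byte into its (mod, reg, rm) bit fields."""
    mod = (modrm >> 6) & 0b11   # top two bits: addressing mode
    reg = (modrm >> 3) & 0b111  # middle three bits: register or opcode extension
    rm = modrm & 0b111          # low three bits: register/memory operand
    return mod, reg, rm

# Opcode FF followed by ModRM 0x36: reg == 6, i.e. "FF/6" (PUSH r/m16).
push_fields = modrm_fields(0x36)

# Opcode 8F followed by ModRM 0x06: reg == 0, i.e. "8F/0" (POP r/m16).
pop_fields = modrm_fields(0x06)
```

So the question is about what the chip does when the reg field is 7 after FF, or 1 through 7 after 8F.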

schoen 2118 days ago

What's "random logic"? From context, it sounds like circuitry that explicitly implements the functionality of an opcode, as opposed to circuitry that can be used by the microcode, or something?

ajenner 2118 days ago

Yes, exactly - the logic that implements the simpler instructions directly as special-purpose gates rather than microcode.

kens 2118 days ago

To expand on that, "random logic" means that it looks random; it's not actually random. This is in contrast to circuits that have an underlying structure to them, like a PLA or ROM.

derefr 2118 days ago

> While most of the unused parts of the ROM (64 instructions) are filled with zeroes, there are a few parts which aren't. The following instructions appear right at the end of the ROM [...]

Given that they're right at the end — and seemingly intentionally written there after the rest of the unused space before them was zeroed — might those bytes be a checksum of the ROM?

ajenner 2118 days ago

I don't think there's anything on the chip that could compute a checksum of the microcode ROM contents. It could be some kind of copyright message perhaps, though I don't know how it's encoded and it's only 42 bits long so there isn't much space for anything meaningful.

derefr 2118 days ago

I would guess that it’s not a runtime-verified checksum, but rather a simple embedded “sum complement” value, used for ROM-mastering-time integrity verification.

A sum-complement value is a value computed from some data, such that, when the data is checksummed with the sum-complement value now embedded into it, the data will sum to zero. This approach to checksumming is useful, as any potential verifier just has to throw the image-as-a-whole through the checksumming algorithm, and ensure that the output is zero. It doesn’t need one iota of knowledge about what it’s verifying. It doesn’t even need an extra machine-register to hold the expected checksum.
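A minimal sketch of the idea, assuming the simplest possible checksum: an 8-bit additive sum over all bytes, with the sum-complement stored as one extra byte at the end. The real width, algorithm, and placement vary by system; the function names here are hypothetical.

```python
def sum_complement(data: bytes) -> int:
    """Return the byte that makes the 8-bit sum of data plus that byte zero."""
    return (-sum(data)) % 256

def verifies(image: bytes) -> bool:
    """Blind check: the image is good if all of its bytes sum to zero mod 256."""
    return sum(image) % 256 == 0

payload = bytes([0x12, 0x34, 0x56, 0x78])

# Embed the sum-complement value into the image itself.
image = payload + bytes([sum_complement(payload)])

# A single corrupted byte makes the image stop summing to zero.
corrupted = bytes([image[0] ^ 0x01]) + image[1:]
```

Note that `verifies` takes no expected value as an argument; it only needs the image.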

These “blind” checksums allow ROM production hardware (programmers, copiers) to both pre-verify the integrity of the input image, and to post-verify that it has programmed the image onto a chip successfully. No special container format for the ROM image is required, nor is the ROM image required to be structured in any particular way (which is good, because ROMs are used for all sorts of things, not just code.) The ROM image can be any opaque blob, just as long as it sums to zero.

In fact, you don’t even need a ROM “image” at all. It’s possible to integrity-verify a programmed ROM “against itself”; and thus, a hand-programmed ROM (e.g. an EEPROM you programmed in your office) can be sent to the duplication facility to serve as the reference from which mask-ROM masks will be generated. The data on the EEPROM can be trusted, because it sums to zero. And the mask ROMs themselves can be checked for flaws by seeing whether they sum to zero.

For smaller-scale ROM distribution, ROM-to-PROM bulk copiers are used. These copiers can be made to both pre-verify the source and post-verify the programmed copies. Using this approach to checksumming, the copier can avoid having to verify the source “against” the destination, instead only needing to verify the source once, and then verify the destinations against themselves. This both speeds up verification and allows for the use of simpler microcontrollers in these copiers, which reduces their design cost. (By quite a lot, back in the 1970s, when all this was most relevant.)

You can see this approach to checksumming in practice in early-generation game cartridge ROMs, which almost always have these embedded sum-complement values (and so presumably were integrity-verified during mastering/duplication.) These sum-complement value fields get referred to by emulators as “the checksum” of the ROM image—but technically, they’re not; if you’re following along, you’ll realize that “the checksum” of such ROM images is zero! :)

bogomipz 2117 days ago

"In fact, you don’t even need a ROM “image” at all."

What exactly is a ROM image? Is it just the ROM contents encoded in some defined file format? If so, what would a common format be?

derefr 2117 days ago

I was being kind of loose with terminology; technically, a “ROM image” is an image (i.e. a replica, like a disk image) of a ROM chip.

ROM is random-access for reads—it’s “memory” in the same sense that RAM is memory, wiring onto a device’s address bus and so becoming part of that device’s physical memory layout.

So when people say that a game-cartridge backup device or the like captures a “ROM image”, what they really mean is that it captures “a snapshot of what the mapped region of the address space that the ROM chip claims to map for — or seems to be wired to — looks like.” Sometimes there’s metadata in the ROM itself saying what region the ROM maps for. But since the ROM is just a physical chip sitting on the bus, it can map or not map for any address arbitrarily (as long as it has the correct address lines wired to discriminate that address from other addresses.)

This is what results in so-called “overdumps” — this is where a ROM chip doesn’t actually respond to all the read requests that its mapping claims it does, and thus, for some reads (usually the ones at the top end of the ROM’s address space) you don’t get a response from the ROM, leaving the data bus floating (“open bus”), giving you undefined data for those reads.

This is why I say that a ROM image is technically an image of the address space a ROM occupies as discovered by requesting those addresses, and not an image of the ROM’s contents per se: most ROM images are, in fact, overdumps. It’s just that more modern systems have pull-up or pull-down resistors on the data bus to ensure that any read the ROM doesn’t respond to returns a fixed value (all ones with pull-ups, all zeroes with pull-downs) instead of floating garbage.

ROM copiers are really “ROM image” copiers — they work by programming the destination ROM(s) with the data discovered by probing the source ROM’s address space, as above. If the destination ROM is larger than the source ROM, the destination ROM will record an overdump of the source ROM.
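A toy model of how an overdump arises, under stated assumptions: the ROM is a byte string mapped at a hypothetical base address, and any read it does not answer returns a fixed fill value of 0xFF (as a pulled-up 8-bit bus would; a real floating bus would return unpredictable data). All names and addresses are made up for illustration.

```python
UNDRIVEN = 0xFF  # assumption: value seen on the bus when no chip responds

def read_bus(rom: bytes, base: int, address: int) -> int:
    """Return what a dumper would see when reading one address."""
    offset = address - base
    if 0 <= offset < len(rom):
        return rom[offset]  # the ROM answers this read
    return UNDRIVEN         # nothing answers; the dumper records filler

def dump(rom: bytes, base: int, start: int, length: int) -> bytes:
    """Probe `length` consecutive addresses and record whatever comes back."""
    return bytes(read_bus(rom, base, start + i) for i in range(length))

rom = bytes([0xDE, 0xAD, 0xBE, 0xEF])  # a 4-byte "chip"

# Dumping 8 addresses from a 4-byte ROM yields an overdump: the real
# contents followed by whatever the bus returned for the unanswered reads.
image = dump(rom, base=0x8000, start=0x8000, length=8)
```

The resulting image is twice the size of the chip, and nothing in the image itself says where the real contents end.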

All that being said, when originally programming an EEPROM, the ROM-programming device doesn’t actually interface to your computer as writable random-access memory. It interfaces as, essentially, a hybrid serial/block device — i.e. a device where you can either write (program) one byte to an arbitrary address, or write (program) a whole ROM-block (usually 64 bytes) at a time. You can also erase an entire block.

In other words, functionally, an EEPROM accessed through a programming device acts very similarly to flash memory accessed through a flash controller. (Flash memory is, in essence, an EEPROM technology with very fast writes trading off against slower, block-at-a-time reads rather than bus-speed byte-at-a-time reads.)

What that means, in practice, is that there’s no particular constraint on how you first program the data into the EEPROM you’re going to be mastering PROMs with. There’s no “ROM programmer file format”, any more than there’s a common file format used to descriptively represent the instructions the various mkfs(8) utils use to initialize filesystems onto a block device. Programming EEPROMs is a procedure, not data per se.

That being said, if we wanted to represent the process of programming an EEPROM using modern file formats, a CUE sheet (or equivalent) would probably be the best approach. A CUE sheet isn’t a description of the intended result, but rather a sequence of instructions for an abstract “burner” to go through to produce a result. Unlike a ROM image, which just tells you what you got when you tried to read from the addresses in an assumed-mapped memory region, a CUE sheet tells you what some other device originally tried to put at those addresses, and so lets you figure out which reads are “true” answers from the ROM, vs “open bus” answers, vs. de-facto responses from a pull-up resistor. (It also lets you emulate the process of cell wear, and so figure out which cells were intentionally “programmed to death”, allowing a faithful representation of “indeterminate state” addresses, much like the Applesauce image format[1] does for magnetic-flux media.)

[1] https://wiki.reactivemicro.com/Applesauce#Applesauce_Image_F...

So, to be clear, there's no defined file format for ROMs generally. You know the size of the EEPROM chip sitting in the programmer; you have some data you'd like to write (maybe in a file; maybe as a stream); as long as the data is no larger than the chip, you can just dd(1) the data, blockwise, onto the programmer block-device, and you'll get a programmed EEPROM.

But if you want to make this friendly to consumers — say, if the EEPROM is your computer's BIOS ROM — then you take a ROM image you've constructed some other way; wrap it in your own format with checksums et al; create a "flasher" program that first verifies the integrity of the ROM image against the checksum, and then dd(1)s it to the EEPROM programmer block-device. Usually the file extension OEMs decided on for these ROM-in-container files was ".bin". Doesn't mean anything; they were arbitrary formats, or sometimes not formats at all, just raw ROM images.
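A rough sketch of such a "flasher", with an invented container format: a 4-byte little-endian CRC-32 followed by the raw ROM image. No real OEM format is being described here; the layout, the function names, and the use of an in-memory buffer in place of the programmer device are all assumptions for illustration.

```python
import io
import struct
import zlib

def wrap(image: bytes) -> bytes:
    """Build the hypothetical container: CRC-32 header plus raw image."""
    return struct.pack("<I", zlib.crc32(image)) + image

def flash(container: bytes, device, chip_size: int) -> int:
    """Verify the container, then write the raw image to the device.

    `device` is any seekable, writable binary file object standing in
    for the programmer. Returns the number of bytes written.
    """
    if len(container) < 4:
        raise ValueError("container too short to hold a checksum")
    (expected,) = struct.unpack("<I", container[:4])
    image = container[4:]
    if zlib.crc32(image) != expected:
        raise ValueError("checksum mismatch; refusing to flash")
    if len(image) > chip_size:
        raise ValueError("image does not fit on the chip")
    device.seek(0)
    return device.write(image)

device = io.BytesIO()  # stand-in for the EEPROM programmer
written = flash(wrap(b"BIOS"), device, chip_size=64)
```

The integrity check runs before anything touches the device, so a damaged download is rejected instead of being half-written to the chip.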

bogomipz 2116 days ago

Thanks for the wonderfully detailed reply. I had a follow-up question: does the ROM designer, or any part of the ROM itself, ever have to know where in memory it is mapped?

mkup 2118 days ago

Does MUL/IMUL/IDIV result negation trick (via REP prefix) work on later 8086-compatible Intel CPUs (e.g. 80286, 80386 etc)?

ajenner 2118 days ago

I have just learned from dreNorteR on VCF that it has no effect on a 286 but has a different, unexpected, and useful effect on a 186! http://www.vcfed.org/forum/showthread.php?76657-8088-8086-mi...

dm319 2118 days ago

I wonder if you could do a version of this article for a lay person like me? I really enjoyed Ken's articles because they assumed very little knowledge.

Akababa 2116 days ago

If ROM space is so valuable, why is ASCII text (which I imagine is relatively large) stored on it?

soufron 2118 days ago

Yup. Who wrote the microcode at the time? And how long did it take?

ajenner 2118 days ago

According to https://en.wikipedia.org/wiki/Intel_8086: "The architecture was defined by Stephen P. Morse with some help and assistance by Bruce Ravenel (the architect of the 8087) in refining the final revisions. Logic designer Jim McKevitt and John Bayliss were the lead engineers of the hardware-level development team and Bill Pohlman the manager for the project." I expect the microcode was developed in tandem with the rest of the chip, so probably took about 2 years.