by klodolph 716 days ago
> One metadata and one the file contents.

I’d say this is not the right way to describe a resource fork. Instead, think of it as two sets of file contents—one called "data" and one called "rsrc". On-disk, they are both just bytestreams.

The catch is that you usually store a specific structure in the resource fork: smaller chunks of data indexed by 4-byte type codes and 2-byte integer IDs. Applications on the 68K normally stored everything in the resource fork: code, menus, dialog boxes, pictures, icons, strings, and whatever else. If you copied an old Mac application to a PC or Unix system without translation, what you got was an empty file. This meant that Mac applications had to be encoded into a single stream to be sent over the network. Early on, that meant BinHex .hqx or MacBinary .bin, and later on you saw StuffIt .sit archives.

That’s why these structures don’t fit into an inode—it’s like you’re trying to cram a whole goddamn file in there. The resource fork structure had internal limits that capped it at 16 MB, but you could also just treat it as a separate stream of data and make it as big as you want.
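To make the "chunks indexed by type code and ID" structure concrete, here is a minimal Python sketch of a resource fork reader. The field offsets follow the classic resource file layout as I recall it from Inside Macintosh, so treat them as assumptions rather than a verified spec. `parse_resource_fork` and `build_resource_fork` are hypothetical names, and the builder exists only so the parser has something to read. Note the 3-byte data offset in each reference entry: that 24-bit field is one place the 16 MB cap comes from.

```python
import struct

def parse_resource_fork(fork: bytes) -> dict:
    """Return {(type_code, resource_id): data} from raw resource fork bytes."""
    # Fork header: offset of data area, offset of map, then their lengths.
    data_off, map_off, _data_len, _map_len = struct.unpack_from(">IIII", fork, 0)
    # Map header: 16-byte header copy, 4-byte handle, 2-byte file ref,
    # 2-byte attributes, then offsets (from map start) to type and name lists.
    type_list_off, _name_list_off = struct.unpack_from(">HH", fork, map_off + 24)
    type_list = map_off + type_list_off
    (type_count_m1,) = struct.unpack_from(">H", fork, type_list)
    resources = {}
    # Counts are stored minus one, so 0xFFFF means "no types at all".
    for t in range((type_count_m1 + 1) & 0xFFFF):
        type_code, res_count_m1, ref_off = struct.unpack_from(
            ">4sHH", fork, type_list + 2 + 8 * t)
        for r in range(res_count_m1 + 1):
            entry = type_list + ref_off + 12 * r
            res_id, _name_off, attrs_and_off = struct.unpack_from(">hHI", fork, entry)
            # High byte = attributes, low 24 bits = offset into the data area.
            start = data_off + (attrs_and_off & 0x00FFFFFF)
            (length,) = struct.unpack_from(">I", fork, start)
            resources[(type_code, res_id)] = fork[start + 4:start + 4 + length]
    return resources

def build_resource_fork(resources: dict) -> bytes:
    """Build a minimal fork (no names, no attributes) for round-trip testing."""
    by_type = {}
    for (type_code, res_id), payload in sorted(resources.items()):
        by_type.setdefault(type_code, []).append((res_id, payload))
    data = type_entries = ref_entries = b""
    ref_off = 2 + 8 * len(by_type)  # reference lists follow the type entries
    for type_code, items in by_type.items():
        type_entries += struct.pack(">4sHH", type_code, len(items) - 1, ref_off)
        for res_id, payload in items:
            # Name offset 0xFFFF means "unnamed"; trailing 4 bytes are a handle slot.
            ref_entries += struct.pack(">hHII", res_id, 0xFFFF, len(data), 0)
            data += struct.pack(">I", len(payload)) + payload
        ref_off += 12 * len(items)
    type_list = (struct.pack(">H", (len(by_type) - 1) & 0xFFFF)
                 + type_entries + ref_entries)
    map_len = 28 + len(type_list)
    data_off = 256  # conventional gap after the 16-byte header
    header = struct.pack(">IIII", data_off, data_off + len(data), len(data), map_len)
    res_map = header + bytes(8) + struct.pack(">HH", 28, map_len) + type_list
    return header + bytes(data_off - 16) + data + res_map
```

Round-tripping something like `{(b"STR ", 128): b"hello", (b"MENU", 1): b"\x00\x01"}` through `build_resource_fork` and `parse_resource_fork` should give back the same dictionary. A real fork also carries resource names and per-resource attribute bits, which this sketch ignores.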

3 comments

From https://en.wikipedia.org/wiki/Resource_fork:

> While the data fork allows random access to any offset within it, access to the resource fork works like extracting structured records from a database.

So, whatever the on-disk structure, the motivation here is that from an OS API perspective, software (including the OS itself) can interact with files as one "seekable stream of bytes" (the data fork), and one "random-access key-value store where the values are seekable streams of bytes" (the resource fork).

So not quite metadata vs data, but rather "structured data" (in the sense that it's in a known format that's machine-readable as a data structure to the OS itself) and "unstructured data."

The on-disk representation was arbitrary; in theory, some version of HFS could have stored the data and resource forks contiguously in a single extent and just kept an inode property to specify the delimiting offset between the two. Or could have stored each hunk of the resource fork in its own extent, pre-offset-indexed within the inode; and just concatenated those on read / split them on write, if you used the low-level API that allows resource forks to be read/written as bytestreams.

This in mind, it's curious that we never saw an archive file format that sends the hunks within the resource fork as individual files in the archive beside the data-fork file, to allow for random access / single-file extraction of resource-fork hunks. After all, that's what we eventually got with NeXT bundle directories: all the resource-fork stuff "exploded" into a Resources/ dir inside the bundle.
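For what it's worth, the hypothetical "exploded" archive is easy to sketch. The Python below assumes you already have the resources as a `{(type_code, id): bytes}` mapping and writes each one as its own ZIP member next to the data fork. The member naming scheme (`rsrc/<hex type>/<id>`) and both function names are made up for illustration. No real archive format is being described here.

```python
import io
import zipfile

def explode_to_zip(data_fork: bytes, resources: dict) -> bytes:
    """Write the data fork and each resource as separate ZIP members."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("data", data_fork)
        for (type_code, res_id), payload in resources.items():
            # Hex-encode the 4-byte type code: codes like b"ICN#" or b"STR "
            # contain characters that are awkward in member names.
            zf.writestr(f"rsrc/{type_code.hex()}/{res_id}", payload)
    return buf.getvalue()

def read_one_resource(archive: bytes, type_code: bytes, res_id: int) -> bytes:
    """Extract a single resource without touching the rest of the archive."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        return zf.read(f"rsrc/{type_code.hex()}/{res_id}")
```

The trade-off is visible in what gets dropped: resource names, attribute bits, and the original byte layout of the fork do not survive, so you cannot reconstruct the fork bit-for-bit from this layout alone.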

> So, whatever the on-disk structure, the motivation here is that from an OS API perspective,

There are multiple layers to the OS API. There is the Resource Manager, which provides the structured view. Underneath it is the File Manager, which gives you a stream of bytes. You can use either API to access the resource fork, and there are reasons why you would use the lower-level API.

One example from the documentation was to provide a backup. For various reasons, it was possible that a resource fork could become corrupt. This was back in the day when Mac OS had no protected memory (for shame!), disk was slow, and we didn’t use journaling filesystems. Some programs kept around backup copies of whatever file you were working on. If your data was stored in the resource fork, well, there’s an easy way to get a backup… just open the resource fork as a stream of bytes and copy it to another place on disk. You could copy it to a data fork, and some people even copied it to the data fork of the same file.

The other main reason you would use the lower-level API is because you are writing a program like MacBinary or StuffIt.

> This in mind, it's curious that we never saw an archive file format that sends the hunks within the resource fork as individual files in the archive beside the data-fork file,

Well, there are advantages and disadvantages to that approach. You can already access resources inside a resource fork inside various archive formats, like MacBinary, AppleDouble, and AppleSingle. But you probably do want to preserve the actual byte stream of the resource fork itself. (And there’s also an undocumented compression format for single resources.)

I am not old enough to know how resource forks were implemented on Mac OS but this is definitely not the case today. Resource forks are implemented (or maybe "emulated" is a better word to use? Not sure how much effort is put into them) as random-access. You can use POSIX APIs to interact with them (using _PATH_RSRCFORKSPEC) and these are typically faster than other interfaces.

Back in the day, you used the Resource Manager to open a resource fork. The resource manager provides functions to load individual resources, query which resources exist, and add or modify existing resources.
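As a rough illustration of the POSIX route: on macOS, `_PATH_RSRCFORKSPEC` from `<sys/paths.h>` is the suffix `/..namedfork/rsrc`, and appending it to a file's path gives you the resource fork as an ordinary byte stream. The Python helpers below are hypothetical names of my own, and the actual reads only make sense on macOS with a filesystem that supports named forks.

```python
import os
import shutil

# Value of _PATH_RSRCFORKSPEC in macOS's <sys/paths.h>.
RSRC_FORK_SUFFIX = "/..namedfork/rsrc"

def resource_fork_path(path) -> str:
    """Path that addresses the resource fork of `path` as a plain stream."""
    return os.fspath(path) + RSRC_FORK_SUFFIX

def read_resource_fork(path) -> bytes:
    """Read the whole resource fork as raw bytes (macOS only)."""
    with open(resource_fork_path(path), "rb") as f:
        return f.read()

def backup_resource_fork(path, backup_path) -> None:
    """Copy the resource fork's raw bytes into the data fork of another file."""
    with open(resource_fork_path(path), "rb") as src, open(backup_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
```

`backup_resource_fork` is the same "copy the fork as a byte stream" trick described earlier in the thread, just done through the filesystem instead of the File Manager.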

The Resource Manager made it to Mac OS X as part of Carbon. The main part of Carbon is gone, but a part of it called CarbonCore survives, and that contains the Resource Manager. If you dig through the docs, you can find it. It was deprecated in 10.8 (which seems really late… the writing was on the wall about resources back when 10.0 hit).

https://developer.apple.com/documentation/coreservices/carbo...

The modern Resource Manager functions in CarbonCore, I think, just use the POSIX API underneath. Undoubtedly, there’s some test suite at Apple that makes sure it works correctly. Also undoubtedly, there are some application vendors who wrote code using resources in the 1990s and still have some of that shipping today.

In Unix, it's said that "Everything is a file" - i.e. that everything on the system that applications need to manage should either be actual files on disk or present themselves to the application as if they were files.

This adage translated to classic Mac OS becomes "Everything is a resource". The Resource Manager started out as developer cope from Bruce Horn for not having access to Smalltalk anymore[0], but turned out to completely overtake the entire Macintosh Toolbox API. Packaging everything as type-coded data with standard-ish formats meant cross-cutting concerns like localization or demand paging were brokered through the Resource Manager.

All of this sounds passé today because you can just use directories and files, and have the shell present the whole application as a single object. In fact, this is what all the ex-Apple staff who moved to NeXT wound up doing, which is why OSX has directories that end in .app with a bunch of separate files instead. The reason why they couldn't do this in 1984 is very simple: the Macintosh File System (MFS) that Apple shipped had only partial folder support.

To be clear, MFS did actually have folders[1], but only one directory[2] for the entire volume. What files went in which folders was stored in a separate special file that only the Finder read. There was no Toolbox support for reading folder contents, just the master directory, so applications couldn't actually put files in folders. Not even using the Toolbox file pickers.

And this meant the "sane approach" NeXT and OSX took was actually impossible in the system they were developing. Resources needed to live somewhere, so they added a second bytestream to every file and used it to store something morally equivalent to another directory that only holds resources. The Resource Manager treats an MFS disk as a single pile of files that each holds a single pile of resources.

[0] https://www.folklore.org/The_Grand_Unified_Model.html?sort=d...

[1] As in, a filesystem object that can own other filesystem objects.

[2] As in, a list of filesystem objects. Though in MFS's case it's more like an inode table...

One of the most important technical details about resources in early Mac OS is that the system could swap them out by using double-indirect pointers (aka handles), with the lock bit stuffed into the upper 8 bits of the 32-bit pointer. Stealing the flag bits from the upper bits, instead of increasing the alignment to make a few lower bits available, was fine on the 68000 and 68010 with their 24-bit address space, but blew up in your face on an 020/030 with a real 32-bit address space. It was a nightmare to develop and debug: a mix of assembler, Pascal, and C without memory protection. But at least you could use ResEdit to put insults into menu entries on school computers.

Good ol' purgeable resources: one of the reasons why the early Mac could get away with 128 KB and lots of floppy swapping.
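A toy model of the handle trick, in Python, for anyone who never had to debug it. A handle is modeled as an index into a table of master pointers, and the flags live in the top byte of each master pointer. The bit assignments (lock in bit 7 of the high byte, purgeable in bit 6, resource in bit 5) are from my memory of the classic Memory Manager, so treat them as an assumption. All names here are illustrative and this is not Toolbox code.

```python
LOCK_BIT = 0x80000000       # assumed: bit 7 of the high byte
PURGEABLE_BIT = 0x40000000  # assumed: bit 6 of the high byte
RESOURCE_BIT = 0x20000000   # assumed: bit 5 of the high byte
FLAG_MASK = 0xFF000000
ADDRESS_MASK_24 = 0x00FFFFFF

# Master pointer table owned by the memory manager; a "handle" is an index.
master_pointers = [0x00123456]

def deref(handle: int) -> int:
    """Double indirection: handle -> master pointer -> 24-bit block address."""
    return master_pointers[handle] & ADDRESS_MASK_24

def hlock(handle: int) -> None:
    """Set the lock flag in the master pointer's high byte."""
    master_pointers[handle] |= LOCK_BIT

def is_locked(handle: int) -> bool:
    return bool(master_pointers[handle] & LOCK_BIT)

def move_block(handle: int, new_address: int) -> None:
    """Relocate a block; only the master pointer changes, handles stay valid."""
    if is_locked(handle):
        raise RuntimeError("block is locked and cannot be moved")
    flags = master_pointers[handle] & FLAG_MASK
    master_pointers[handle] = flags | (new_address & ADDRESS_MASK_24)
```

With a 24-bit address bus the masking is harmless. Put a block above 16 MB, say at `0x01123456`, and `deref` silently returns `0x00123456`, which is exactly the kind of bug that made "32-bit clean" a selling point.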
>> One metadata and one the file contents.

> I’d say this is not the right way to describe a resource fork. Instead, think of it as two sets of file contents—one called "data" and one called "rsrc". On-disk, they are both just bytestreams.

I think it's a perfectly fine way. You're just coming at it from a wildly different level of abstraction.

One could say yours is not the right way either and jump down into quantum fields as another level.

GP is more accurate, because "file contents" could be in either or both. Not all files had a data fork, and not all files had a resource fork. Some metadata, such as icon position, was also stored independently of the file, using the hidden Desktop database.