by rogerbinns 3856 days ago
Alternate parts to the main file are present on several platforms, and have the same problems. They are called resource forks on Mac.

On Windows/NTFS they are called alternate data streams - https://en.wikipedia.org/wiki/Fork_(file_system) - to use one, append a colon and a stream name to the filename (e.g. example.txt:myads).
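
A minimal sketch of the colon syntax in Python. The helper names ads_path and write_stream are made up for illustration, and the write only makes sense on Windows with an NTFS volume; elsewhere the colon is just an ordinary filename character, so the sketch refuses to write.

```python
import sys


def ads_path(filename: str, stream: str) -> str:
    """Build the 'filename:stream' form (hypothetical helper)."""
    if not stream or ":" in stream:
        raise ValueError("stream name must be non-empty and contain no colon")
    return f"{filename}:{stream}"


def write_stream(filename: str, stream: str, data: bytes) -> bool:
    """Write data to an alternate data stream; returns False off Windows."""
    if sys.platform != "win32":
        # Assumption: only Windows/NTFS interprets the colon as a stream separator.
        return False
    with open(ads_path(filename, stream), "wb") as fh:
        fh.write(data)
    return True


print(ads_path("example.txt", "myads"))
```

The main file's reported size is not affected by what is written to the stream, which is the behaviour the thread is discussing.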

On Unix, Linux, OS/2 etc you can find extended attributes - https://en.wikipedia.org/wiki/Extended_file_attributes - which allow storing key value pairs on a file. Restrictions exist and vary.
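
A rough sketch of the key/value idea using Python's os module, assuming Linux (os.setxattr and os.getxattr do not exist on other platforms). The attribute name user.comment is just an example; the "user." prefix is the namespace unprivileged processes are normally allowed to write. Some filesystems reject user attributes, so the sketch returns None rather than failing.

```python
import os
import tempfile


def xattr_roundtrip(path, key, value):
    """Set one extended attribute and read it back.

    Returns None when the platform or filesystem does not support it.
    """
    if not hasattr(os, "setxattr"):
        return None  # these functions are only present on Linux builds of Python
    try:
        os.setxattr(path, key, value)
        return os.getxattr(path, key)
    except OSError:
        return None  # e.g. the filesystem does not accept user attributes


with tempfile.NamedTemporaryFile() as tmp:
    # "user.comment" is a hypothetical attribute name chosen for this example.
    result = xattr_roundtrip(tmp.name, "user.comment", b"hello")
    print(result)
```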

As for an example of them being helpful - on Windows when you download a file from the Internet using a browser an extended attribute is used to mark that. Trying to execute the file from Explorer then explains that it was downloaded and asks if you really want to proceed.
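
As far as I know, on NTFS this mark is kept in an alternate data stream named Zone.Identifier, whose content is a small INI-style text. The sketch below only parses such text; the sample string is an assumed typical payload (zone 3 being the Internet zone), and parse_zone_id is a made-up helper name.

```python
import configparser


def parse_zone_id(stream_text: str):
    """Return the ZoneId from Zone.Identifier-style text, or None if absent."""
    # Interpolation is disabled so URLs containing '%' in other keys do no harm.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(stream_text)
    if not parser.has_option("ZoneTransfer", "ZoneId"):
        return None
    return parser.getint("ZoneTransfer", "ZoneId")


# Assumed sample content; on Windows it would be read from "<file>:Zone.Identifier".
sample = "[ZoneTransfer]\nZoneId=3\n"
zone = parse_zone_id(sample)
print(zone)
```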

On Linux, SELinux can store labels in extended attributes.

Older tools that are ignorant of them won't know about this, but they don't substantially harm anything. Modern tools do know about them and do the right thing (e.g. copying a downloaded file elsewhere on Windows will still give the warning). The GNU cp command on Linux does not copy extended attributes by default; it requires the --preserve=xattr flag to do so. Dropbox does support them by default and cross-platform.
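
Python's standard library has a similar default-versus-opt-in split, which makes for an easy experiment: shutil.copy leaves extended attributes behind, while shutil.copy2 tries to carry them over on Linux. This is a sketch assuming Linux and a filesystem that accepts user attributes; the attribute name user.origin and the function name are invented for the example, and it returns None where the experiment cannot run.

```python
import os
import shutil
import tempfile


def copy_keeps_xattr(key="user.origin", value=b"downloaded"):
    """Return (kept_by_copy, kept_by_copy2), or None if xattrs are unavailable."""
    if not hasattr(os, "setxattr"):
        return None  # not a Linux build of Python
    with tempfile.TemporaryDirectory() as workdir:
        src = os.path.join(workdir, "src.bin")
        with open(src, "wb") as fh:
            fh.write(b"payload")
        try:
            os.setxattr(src, key, value)
        except OSError:
            return None  # filesystem refuses user attributes
        plain = shutil.copy(src, os.path.join(workdir, "plain.bin"))
        kept = shutil.copy2(src, os.path.join(workdir, "kept.bin"))
        return key in os.listxattr(plain), key in os.listxattr(kept)


outcome = copy_keeps_xattr()
print(outcome)
```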


Apple added support for extended attributes to HFS+ in Mac OS X 10.4 Tiger:

http://arstechnica.com/apple/2005/04/macosx-10-4/7/#extended...

Today, resource forks are actually exposed through the extended attributes interface:

http://arstechnica.com/apple/2013/10/os-x-10-9/9/#tags-imple...

You can see the source code for all this in Apple's Darwin open source repository. Example: http://opensource.apple.com/source/xnu/xnu-2782.40.9/bsd/hfs...

Please read this, written by the creator of resource forks, to learn why they exist:

http://www.folklore.org/StoryView.py?story=The_Grand_Unified...

John, Have you heard any rumblings from within Apple about a new filesystem? Is there any hope?
There's always hope.
Ding.
Conflating extattrs with HFS style resource forks doesn't really make sense.

Extended attributes are just that... attributes. They are not the main "data" of the file. Each and every unix file system has some attributes to begin with (permissions, timestamps etc.) and they vary from filesystem to filesystem (e.g., setuid bit, immutable bit etc.). User-supplied attributes just extend this concept in a natural way. All the standard tools expect to be able to call open() on the file and start reading a stream of bytes, assuming there's sufficient access. That's what a "file" is.

Moreover, it's not just unix as an isolated system. Those expectations are baked into the structure of the entire Internet. When you receive a "file" over any medium like email, web etc., you're expecting to receive the aforementioned stream of bytes. The attributes (or extended attributes) are not expected to accompany the file data as a general case.

Resource forks, on the other hand, just completely work against reasonable user expectations. The example given in the OP's post is one such instance: a font file that shows up as having zero bytes to every tool that works with files, including tools that expect to transfer "files" over the internet. It's just broken by design.

Just a datapoint: Macs have been sending files via the Internet with resource forks for a long time, basically since the invention of the Macintosh: https://en.wikipedia.org/wiki/BinHex

Resource forks are basically legacy from the original MacOS, and something that's being retained for compatibility, not something that's really a current design. The current replacement of resource forks is bundles, where a directory masquerades as a file in the GUI.

There are lots of reasons to hate on HFS+, but I wouldn't consider this the most important one.

It's not broken, it's just old. It's a legacy technology from the original Macintosh system released in 1984 - before it was even called "Mac OS". Your "reasonable user expectations" are conditioned by three decades of experience with systems that largely didn't exist when the Macintosh resource fork was designed.
Yes, the filesystem is from the '80s, a period when the Unix filesystem model had perhaps not yet gained the dominance that it did in later years. The question is: why is it still being used in OS X 15 years after a complete redesign of the system around a Unix kernel?
They had to support HFS+ in order to convince existing users to upgrade. The first five releases of OS X had something called "Classic mode" which was basically an instance of OS 9 running as a process, so that people could continue using their pre-OS X apps. It took six years before enough developers had upgraded that Apple could stop supporting Classic. And Classic, of course, was all about HFS+. So Apple could either backport this hypothetical unix-style filesystem to OS 9, do something kludgerous with single-machine file-sharing, require all users to partition their drives and keep everything they wanted to use in Classic on the HFS+ partition (a tough sell!), or just use HFS+ and get on with it. They wanted to stay afloat, so they stuck with HFS+, and here we are today.
You used to be able to install Mac OS X on UFS volumes but unfortunately they removed that option in later releases :(
BTW, one of the first very popular vulnerabilities of IIS involved making it serve a file's source code rather than executing it (a very convenient way to get database login credentials) by appending ::$DATA to the URL.
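
A hedged sketch of the kind of check a server could apply against that trick: decode the URL path and refuse any final segment that contains a colon, since that is how a stream such as ::$DATA gets named. This is not how IIS was actually patched, just an illustration, and names_ntfs_stream is a made-up helper.

```python
from urllib.parse import unquote


def names_ntfs_stream(url_path: str) -> bool:
    """True if the last path segment contains a colon after percent-decoding."""
    decoded = unquote(url_path)  # also catches %3A-encoded colons
    last_segment = decoded.rsplit("/", 1)[-1]
    return ":" in last_segment


for candidate in ("/app/login.asp", "/app/login.asp::$DATA", "/app/login.asp%3A%3A$DATA"):
    print(candidate, names_ntfs_stream(candidate))
```
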
Arguably, NTFS implemented alternate data streams primarily to support Mac resource forks.
Nope. It was originally done as part of OS/2 (Windows NT was sort of OS/3 - the next version of OS/2) and OS/2 had "extended attributes" - https://en.wikipedia.org/wiki/Extended_file_attributes#OS.2F... - so NT also needed them for compatibility.
Thank you! All these Mac people think Apple Invented Everything.
The resource forks of the Macintosh File System date back to 1984, significantly antedating OS/2.

https://en.wikipedia.org/wiki/Macintosh_File_System

To be fair, resource forks have been around since HFS first came out. I doubt Apple invented this concept, but they definitely predate MS on this front.
Dave Cutler's team writing NT came from VMS which has multi-version files which are pretty much the same as multi-fork files.
Not the same thing at all. On VMS, when you opened a file, made changes and saved them, you automatically got a new version. You could open any version by appending a semicolon and the version number to the filename, but it's not the same thing at all as resource forks.
>As for an example of them being helpful - on Windows when you download a file from the Internet using a browser an extended attribute is used to mark that.

One maybe lesser-known use for ADS: FlylinkDC++ (and derivatives) have an option to store TTH hash data in the file's own ADS instead of in a central hash store. That means the hash data can be used by multiple applications, but it's less I/O-efficient to create thousands of these <4KB blocks everywhere.

You wouldn't put the entire content of a file in an extended attribute, would you?
On Mac Classic, if you wanted to ship a custom font with your application, you'd put it in the Resource fork in a resource of "FONT" type.

So why wouldn't a stand-alone font file consist of a Resource fork with a single resource of "FONT" type? Otherwise, the OS engineers have to develop two entirely different ways of reading in font data. Why duplicate the effort?

The system made perfect sense, both then and now. It bothers me that so few people know anything about Mac Classic, it really was an amazingly well-designed OS for its time.

> an amazingly well-designed OS for its time

It had a few interesting ideas. But no desktop OS based on cooperative multitasking can be called 'well-designed'; almost anything could hard-lock the entire system at any time.

AmigaOS was a lot better designed and had full preemptive multitasking, etc. Too bad Commodore sucked at marketing.
Considering it was built for a 68000, I believe that was a reasonable decision on Apple's part.
>On Mac Classic, if you wanted to ship a custom font with your application, you'd put it in the Resource fork in a resource of "FONT" type.

On Windows you can do the same thing with .rsrc in an .exe/.dll file.

It's possible you would ship a font as a PE .dll with a font resource, but for a single file the encapsulation is a little unnecessary (unless your application architecture already makes it easy to deal with PE encapsulation).

Except that in the Windows example you gave, the .rsrc section is part of the main data stream. It is counted in the file's size when doing a dir command. When the file is copied by any command-line or shell tool, or sent over the internet, the data in the .rsrc section is copied/sent just like all the other data in the exe/dll.

None of this is true for the resource fork in the Mac case.

Had you been there, you might think different.
It does not strike me as a sane design. You already have file extensions, unique prefixes (like those used by the 'file' command to identify files). Why would you use fs forks to differentiate between file types?
Mac OS Classic had no file extensions. I remember they felt like such a hack when switching to OS X, compared to the flexibility of resource forks.
Actually, Classic Mac OS used type and creator codes, which were attributes on files, to identify filetypes, not resources.
That makes more sense.
There are various arbitrary restrictions in various filesystems and operating systems. The intention is to store smaller amounts of useful information with the file. Smaller means in the bytes/kilobytes range. For example you wouldn't store multi-gigabyte alternate language versions of a video file in its extended attributes. But you could store URLs of where to get them.

Or you could attach authorship, review, dates, keywords and similar metadata which would work for any file type, not just those whose format explicitly has that support.

If you want to get a handle on what filesystem design is like then I highly recommend Dominic Giampaolo's book on the design and implementation of the Be filesystem (he wrote BeFS). The book is freely available from his website as http://www.nobius.org/~dbg/practical-file-system-design.pdf and includes information about the design of other filesystems too. It isn't exhaustive, but does give a very good grounding in filesystems and does cover extended attributes.

Thanks for the recommendation, that book looks interesting.
Fonts, when part of a program, were stored in the resource fork as resources of type "FONT" (similarly, icons went into "ICON" resources). It's natural that font files would have nothing in the data fork and the font definition itself would be in a FONT resource.

Full documentation is available at https://developer.apple.com/legacy/library/documentation/mac...

In Classic Mac OS, most executables consisted entirely of resource forks. Even the code was stored in resources of type CODE.
PowerPC executables stored code in the data fork.
Ah, good to know. My last Mac was m68k (Performa 630CD, one of the very last m68k models), so my memories include seeing plenty of CODE resources when playing around with ResEdit.
Oh, they stuck around, of course. That's an entire other chapter of historical MacOS zaniness, in which many programs and the OS itself contained and ran both 68k and PPC code at the same time; there were 'universal procedure pointers', 'accelerated' CODE resources (with PPC code in them), and on and on.
Is that a problem with the fs and its features/implementation or a problem with how some application/developer chose to use those features?
Actually, I believe OS X does this for small files in order to not burn a whole filesystem block on a sub 1k file.

"In Mac OS X Snow Leopard 10.6, HFS+ compression was added. In open source and some other areas this is referred to as AppleFSCompression. Compressed data may be stored in either an extended attribute or the resource fork.[13] When using non-Apple APIs, AppleFSCompression is not always completely transparent."

https://en.wikipedia.org/wiki/HFS_Plus#History

FS compression is transparent to any regular file-reading API; it only shows up in lower-level APIs, like if you want to copy the file and keep the compression.
I was responding to the comment "You wouldn't put the entire content of a file in an extended attribute, would you?" HFS+ does, although in this case it does so without leaking that detail to callers.