Hacker News
by mort96 1385 days ago
I like the idea. Making it backwards compatible with FAT means that, in principle, regular FAT filesystem implementations could be changed to support big fat files (hehe) transparently.

However, reading the spec, it doesn't look fully backwards compatible? It seems like there are file structures which are possible to represent in FAT which aren't possible to represent in BigFAT. In FAT, I could have a 4GB-128kB size file called "hello.txt", and next to it, a file called "hello.txt.000.BigFAT". A FAT filesystem will show these as two separate files, as intended, but a BigFAT implementation will show them as one file, "hello.txt". That makes this a breaking change.
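
To make the collision concrete, here is a rough Python sketch. It is not taken from the BigFAT spec: the chunk size and the `<name>.NNN.BigFAT` pattern are assumptions based only on the example above, and `bigfat_view` is a hypothetical helper name.

```python
import re

# Assumptions drawn only from the comment above, not from the spec:
# a full chunk is 4 GiB - 128 KiB, and parts are named "<base>.NNN.BigFAT".
CHUNK_SIZE = 4 * 2**30 - 128 * 2**10
PART_RE = re.compile(r"^(?P<base>.+)\.(?P<index>\d{3})\.BigFAT$")


def bigfat_view(entries):
    """Map a plain FAT listing {name: size} to what a BigFAT-aware driver might show."""
    merged = {}
    for name, size in entries.items():
        match = PART_RE.match(name)
        base = match.group("base") if match else None
        if base is not None and entries.get(base) == CHUNK_SIZE:
            # Treated as a continuation: its bytes are folded into the base file.
            merged[base] = merged.get(base, CHUNK_SIZE) + size
        else:
            # Ordinary file (or a part-like name with no full-size base file).
            merged.setdefault(name, size)
    return merged
```

With `{"hello.txt": CHUNK_SIZE, "hello.txt.000.BigFAT": 1000}` as input, the result has a single `hello.txt` entry of `CHUNK_SIZE + 1000` bytes, so two files that were independent under plain FAT can no longer be told apart.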

I would kind of have hoped that they had found an unused but always-zero bit in some header which could be repurposed to identify whether a file has a continuation or not, or some other clever way of ensuring that you can represent all legal FAT32 file structures.

4 comments

There are so many good filesystems out there. Is it really necessary to keep dragging FAT along?

ReactOS is using btrfs, which has so many useful options that FAT will never see (zstd, xxhash, flash-aware options, snapshots, send/receive, etc.). This is positioned both for Linux and Windows.

Microsoft itself restricts ReFS to enterprise use, and btrfs offers so much more functionality. We should stop using a file system from the '80s.

Nothing beats the simplicity of FAT.

Btrfs still has a lot of bugs despite having been in active development for a long time. This is mostly related to its complexity.

If I were going to implement a filesystem for custom hardware, I would definitely not choose btrfs.

If you want snapshots, dedup, transparent compression, and scrubs then you have precisely three open and/or available choices: ZFS, btrfs, and ReFS.

By all means, choose the Microsoft solution, because patent licensing is good for everyone.

And the bug myth passed into history years ago.

"So, we'll repeat this once more: as a single-disk filesystem, btrfs has been stable and for the most part performant for years."

https://arstechnica.com/gadgets/2021/09/examining-btrfs-linu...

None of those features you mention mean anything in the situations where FAT is still being used.

FAT is still popular because it is very easy to implement, and anything can read and write to it. It is pretty easy to implement the file system on a low power microcontroller and have it write data to an SD card. Your users can then plug that SD card into any computer and view the data, or add to it.

Using btrfs in a situation like this means a lot more coding on your end, and your users lose the convenience of the SD card using a file system they can easily interact with.

Nobody is using FAT for their primary system partition. It is almost exclusively relegated to embedded systems and small external storage devices where broad compatibility is an important feature.

But I just want 5GB files
Microsoft just wants a check from you.

We are all forced to pay for this ancient software every time we buy a device that uses it.

Wouldn't this money be better used elsewhere?

https://en.m.wikipedia.org/wiki/File_Allocation_Table#Patent...

AFAIK, FAT32 is patent-free; even the Long File Name patent has expired. The only patents left are on exFAT.
Microsoft isn't getting a check from me using FAT32.
You know patents have a limited life, right?

x86_64 with SSE2 is also patent-free right now, as an example.

All of those patents expired a long time ago
> Nothing beats the simplicity of FAT.

Or the sheer ubiquity, and therefore cross-device compatibility.

Camera manufacturers and SD card manufacturers can't start shipping SD cards formatted with btrfs until Windows supports it out of the box. They can start shipping SD cards formatted with FAT32 and software/firmware which reads and writes FAT32+BigFAT.
More specifically, they need a filesystem that both Windows and MacOS can read. No one wants to take their SD card to a friend's computer and have it not work for reasons they won't understand.

The shared set there is basically just fat and exfat.

If Microsoft and Apple collaborated on a new filesystem, or even just supported it, then we might have a possible successor. However, even with that, the millions of already shipped devices won't support it. Thus, during the transition period of many years, there will still need to be support for FAT.

That keeps fat the lowest common denominator and everything supporting it.

Remember last time they tried a universal media filesystem with UDF? It was implemented in the most incompatible ways as a token gesture by both Microsoft and Apple. These companies want their own, patented, proprietary fs so they can maintain lock-in.

The only way to get a universal standard is to have the community do it and have enough people use it that the big companies have to capitulate.

The problem is you can't get there without out-of-the-box support.
exFAT is a good candidate for a replacement "lowest common denominator" file system, and support for it is growing rapidly now that Microsoft has effectively open-sourced it.

But as you pointed out, in a transitionary period there is still a need to support older devices and software. FOSS purists may also not approve of using exFAT in some situations, since the relevant patents have not yet expired, even if MS has released them to the OIN.

Now that exFAT is "open", I've seen it cropping up much more often. SD cards often ship with it, especially large ones.
That happened before it was open. exFAT is the standard filesystem on SDXC
> can't start shipping SD cards formatted with btrfs until Windows supports it out of the box

3rd parties can write drivers for Windows, you know. A small, read-only FAT partition on a USB stick or SD card could contain the installable drivers necessary to read/write the rest of the disk.

However, that's unnecessary. The best option for a universal file system is UDF. Windows, Mac, and Linux all have full read/write support.

See: https://github.com/JElchison/format-udf

I guess what is needed is a BSD implementation of btrfs.

Still, something similar to fuse might help with the licensing.

No, what is needed is for Windows and macOS to support btrfs out of the box.
If they can ship software that reads BigFAT, why can’t they ship software that reads btrfs?
Other people have given good answers, but here's another one: People's computers can already mount BigFAT-formatted drives.

Do you know what happens when you insert a btrfs-formatted SD card or USB stick into a Windows or macOS machine? It tells you that the drive is unreadable and asks if you want to initialize it. If the user answers yes to that question, the system formats the drive and all of their data is lost.

With a BigFAT-formatted drive, the system will mount it no problem, the user will be able to browse the contents, and the only weird part is that their largest files are split into parts.

Because you only need the software for >4GB files, and block-level access usually requires root or admin. This can be done fully in userspace if your OS already supports FAT32 (it does).
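
As a rough illustration of the userspace approach, here is a Python sketch that reads a split file back as one stream from an ordinary mounted FAT32 volume. The `<name>.NNN.BigFAT` part naming is assumed from the example upthread, and `read_joined` is a hypothetical helper, not part of any real BigFAT tool.

```python
import re
from pathlib import Path


def read_joined(base_path, block_size=1 << 20):
    """Yield the bytes of base_path, then its continuation parts in index order."""
    base = Path(base_path)
    # Assumed naming: "<base name>.NNN.BigFAT" next to the base file.
    part_re = re.compile(re.escape(base.name) + r"\.\d{3}\.BigFAT")
    parts = sorted(
        (p for p in base.parent.iterdir() if part_re.fullmatch(p.name)),
        key=lambda p: p.name,  # zero-padded indices sort correctly as text
    )
    for path in [base, *parts]:
        with open(path, "rb") as handle:
            while block := handle.read(block_size):
                yield block
```

It only uses normal file reads, so no raw block device access (and no root or admin rights) is needed. A real tool would also have to deal with case-insensitive names and missing parts, which this sketch ignores.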
1. BTRFS is a lot more complex.

2. Switching to BTRFS would be a breaking change. BigFAT wouldn't be. You can still use the card in devices that do not support it, without needing to reformat. Those devices would just lose access to some files.

Probably simplicity. It would be easier for a manufacturer to do a quick firmware update that implements BigFAT than having them support BTRFS.
> Is it really necessary to keep dragging FAT along?

Anything involving embedded and without deep pockets has no other option, FAT (sadly) still is the least common denominator. Some speak ExFAT, but not sure how good the tooling support is outside of Microsoft, and there are still patent concerns.

I believe the exFAT patents expire or expired this year.
Even without patent concerns, exFAT takes more effort to implement (lots of features you might not need) and ends up requiring more ROM space that may or may not be available… even if you want to write files exceeding 4GB to some external media.

e.g. some widget without network connection that optionally logs an audit trail to an attached USB media, could still end up with only a couple hundred kilobytes of soldered-on ROM to store the whole firmware, while wanting to write more than 4GB of audit logs.

> ReactOS is using btrfs, which has so many useful options that FAT will never see (zstd, xxhash, flash-aware options, snapshots, send/receive, etc.). This is positioned both for Linux and Windows.

I just want to transfer files on USB sticks without worrying about file size or the OS accessing it. The infuriating part is that it is 2022, and if you want to reliably and easily move files larger than 3-4GB on removable media, people tell you to use proprietary MS file systems like exFAT and NTFS. That is unacceptable.

We NEED a simple, portable, and freely open file system spec for removable media that handles large drives and files.

FAT is for weak little cost-optimized embedded-microcontroller devices that write one file at a time to an SD card — which is something we're still building to this day, in the form of IoT devices. We don't really have any better option for this use-case; every newer filesystem is either non-portable, or assumes stronger hardware such that the overhead of using it on these devices would be huge.

I would note that one way to work around the cost-incentives of IoT manufacturers, would be to encourage them to externalize the storage-layer costs from the device's BOM, by focusing on getting "object-storage oriented" NAND flash controllers pushed down from enterprise to regular retail availability. That way, all the filesystem-layer smarts end up living in the SD card itself — which is sold separately. (It'd be sort of a second coming of the ancient Commodore 1540/1541 paradigm, where the disk controller presented not as block storage, but as, essentially, a single-user serial-attached NAS.)

BtrFS is unsafe for production use unless it's coupled with really good backups.

On the other hand, NTFS on Windows, Ext* on Linux, or ZFS on any supported OS, has not been known to eat data as frequently.

Is BtrFS without RAID safe?

According to them (https://btrfs.wiki.kernel.org/index.php/Status) only RAID56 is unstable.

There are still major bugs in the rest of it. You can trivially corrupt a mirror. There are examples of how to reproduce it exactly in qemu with virtual drives. It caused me total data loss on my first Btrfs filesystem at least 8 years back. That bug is apparently still there. The unbalancing issues are still there. I have zero trust in Btrfs in any form.
Can you link to that? I have been using btrfs in raid1 and single for ~3-4 years now without any data loss.
You (and many others) might well be using RAID1 mirrors without problems. I did as well. But the problems here are not encountered during day-to-day usage. They are bugs in the recovery codepaths following hardware failures. I suffered this due to a transient SATA cable glitch, but the instructions let you exactly reproduce this with qemu with a recent kernel. I've not tried the qemu approach myself; I moved over to ZFS a good while back now.

I've had a hunt for the specific instructions but I'm afraid I can't find it again with a search. The gist of it was to:

- create Btrfs mirror using two qemu virtual disks

- pull the (virtual) plug on one of the pair to disconnect it, then later reconnect it

- Btrfs ends up hosing both the outdated and current copies of the mirror, leading to complete dataloss of the entire mirror

Synology uses Btrfs + mdadm RAID for their NAS boxes, which are regarded as rock-solid.

https://www.synology.com/en-global/dsm/Btrfs

Whatever voodoo they are pulling out of thin air, is clearly not present on commodity, non-tuned Btrfs filesystems. That, and most people have backups set up on their NAS.
Well, if you want, you can just pull the drives out of your Synology box and hook them up to any Linux system [1]. So whichever way they're tuning it, it can't be that voodoo-esque.

[1] https://kb.synology.com/en-global/DSM/tutorial/How_can_I_rec...

With all due respect, this is now far from true.

"So, we'll repeat this once more: as a single-disk filesystem, btrfs has been stable and for the most part performant for years."

https://arstechnica.com/gadgets/2021/09/examining-btrfs-linu...

I mostly use FAT as a go-between for different operating systems. Things could be installed to implement similar functionality around another file system, but it's nice to have a default, built-into-everything format that works on every machine. It has its flaws, but its universality is a huge strength.
What FAT32 filesystem in the real world has a file named "foo.000.BigFAT" on it?
I can imagine that if bigfat is successful, such files will start to exist.

Imagine someone takes a bigfat drive and puts it in a non-bigfat capable machine, then zips up a directory and publishes it.

When that directory is unzipped on a bigfat machine, should the bigfat files be re-joined, or should they show as separate files? One breaks the OS file API and the unzip program might crash/fail, while the other leads to the application trying to create filenames which can't exist in the filesystem.

> should the bigfat files be re-joined, or should they show as separate files

They're only "rejoined" by the BigFAT compatible filesystem driver on access. By running such a driver, you're agreeing that such files should "appear" as one.

Hey, notice that you're suggesting that BigFAT should be disabled by default here; you think the user should have to choose to be running a driver with BigFAT-support. Maybe reflect on whether that's a desirable situation, or if it would've been preferable if the feature could've been enabled by default.
See my response to https://news.ycombinator.com/item?id=32753207. I'm not saying it will break everyone's FAT32 drives, but it is a breaking change in a filesystem, which seems like something kernel people would usually try to avoid.
It's as backwards compatible as any other fat extension done so far.

For example, LFN fails if you create too many files with the same first 6 letters :)

I'm actually honestly not sure why representing all legal FAT32 file structures is a particularly useful goal?

FAT in particular, in all of its forms, has always had limitations and weirdness in filenames, etc.

I don't understand your LFN example. Which FAT file structure can be represented with LFN disabled that's no longer possible to represent if you add support for LFN?

If BigFAT was actually backwards compatible, it would've been a no-brainer to add support for in filesystem drivers. But since it changes the interpretation of some legitimate structures, adding support for BigFAT is a breaking change. I don't know whether operating systems will want to make breaking changes to their FAT32 filesystems, but it certainly seems like a bigger ask.

Err, a directory with all possible six character prefixes, differentiated at the seventh character, is representable without LFN but not with it. Wikipedia actually has links/info on it if you want more.
My favourite sobriquet for MS has long been the DOS view of an Office 97 installation in the filesystem:

MICROS~1

(Long name, Microsoft Office.)

So if you create files called MICROS~2 - MICROS~0, in theory you can create enough abbreviated names that there are no short names available for the long filenames you wish to create. Every LFN must have a "real" 8.3 counterpart.
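
A simplified Python sketch of how such a short alias might be picked, and how the pool can run dry. This is an assumption-laden illustration, not the algorithm of any particular driver: real implementations differ (some switch to hash-based names after a few collisions), and `short_alias` and `max_tail` are hypothetical names.

```python
def _clean(text):
    """Uppercase and keep only letters and digits (a simplification)."""
    return "".join(c for c in text.upper() if c.isalnum())


def short_alias(long_name, taken, max_tail=999_999):
    """Pick an 8.3-style alias for long_name that is not already in `taken`."""
    stem, dot, ext = long_name.rpartition(".")
    if not dot:
        # No extension at all, e.g. "Microsoft Office".
        stem, ext = long_name, ""
    stem, ext = _clean(stem), _clean(ext)[:3]
    for n in range(1, max_tail + 1):
        tail = f"~{n}"
        # The stem is truncated so that stem + tail fits in 8 characters.
        candidate = stem[: 8 - len(tail)] + tail
        if ext:
            candidate += "." + ext
        if candidate not in taken:
            return candidate
    raise FileExistsError(f"no free short name for {long_name!r}")
```

For example, `short_alias("Microsoft Office", set())` gives `MICROS~1`. If every candidate up to `max_tail` is already taken by plain 8.3 files, the call fails, which is the situation described above.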

I imagine the premise is that if you mount this disk in an implementation that doesn't understand these structures it works and you don't corrupt it, making the format backwards compatible with old implementations. This is similar to the trick used to add long filenames: putting a special 8.3 file with a ~ that includes the full file name.