by pronoiac 506 days ago
Is XFS well-regarded by others?

I was benchmarking filesystems by generating a billion empty files on them, and while ext2, ext4, and btrfs could finish in a day or two, xfs hit a wall in the first 4 million files, and was on track to take weeks. After hitting ctrl-c, it took 20+ minutes to get a responsive shell, and unmounting took a few minutes.

This surprised me, because it's been around for decades, and I expected scalability, performance, and robustness. A one-user accidental DoS was a surprising rake to trip over.

2 comments

XFS is well regarded. It is kind of a second-generation UNIX filesystem that was designed from the ground up with journaling and extents and btrees and xattrs.

EXT4 is well regarded too. It has similar goodies now, but it evolved from a first-generation style without journaling and with block (not extent) tracking, primitive kinds of trees implemented as N-indirect arrays, etc. Whether that really matters is debatable. These days I think it's well acknowledged that you have to evolve and adapt to new features, and clean redesigns aren't necessarily better (XFS has gone through a bunch of changes too, e.g., metadata checksums).

XFS came from SGI IRIX and there it was really focused on datapath throughput and scalability (think, HPC systems or graphics clusters working through enormous source and result data files). And they were much less focused on metadata performance.

XFS certainly was much slower and more heavyweight than EXT doing things like creating files, but there has been a lot of work on that over many years and it's a lot better than it was. That said, a billion files is a fairly extreme corner and it's not necessarily what's required for most users.

This paper I found is similar, although probably has many differences from exactly what you were doing:

https://arxiv.org/html/2408.01805v1

XFS creation performance was around par with EXT4 up to 100M files, but took 2x as long to create 1B. Maybe yours did much worse because you aren't splitting files into subdirectories but creating them all in one? That could suggest some XFS structure has a higher order of complexity and is starting to blow out a bit, directory index for example.

XFS often comes out pretty high up on other benchmarks:

https://www.phoronix.com/review/linux-611-filesystems/3

So as always, the better performing one is going to depend on what your exact workload is, especially if you are doing something unusual.

There's not really any reason to use ext4 over xfs anymore, unless you need fscrypt or shrinking.

Reflink support is super useful.

> Maybe yours did much worse because you aren't splitting files into subdirectories but creating them all in one?

No, I wasn't creating them all in one directory, and I'd expect that to be awful. I used 1000 folders, each with 1000 folders, each with 1000 files.
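
For anyone who wants to reproduce that kind of layout at a small scale, here is a rough Python sketch. It is not the script used for the benchmark above; the function name `create_empty_files` and the `fanout` / `files_per_dir` parameters are made up for illustration. A fanout of 1000 with 1000 files per directory would give the billion-file layout, but the demo uses tiny numbers in a temporary directory.

```python
import os
import tempfile

def create_empty_files(root, fanout, files_per_dir):
    """Create fanout x fanout leaf directories under root, each holding
    files_per_dir empty files. Returns the number of files created."""
    created = 0
    for a in range(fanout):
        for b in range(fanout):
            leaf = os.path.join(root, f"{a:04d}", f"{b:04d}")
            os.makedirs(leaf, exist_ok=True)
            for c in range(files_per_dir):
                path = os.path.join(leaf, f"{c:04d}")
                # O_CREAT | O_EXCL creates a zero-length file and fails
                # if the name already exists, so nothing is overwritten.
                fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
                os.close(fd)
                created += 1
    return created

if __name__ == "__main__":
    # Small demo: 3 x 3 directories with 5 files each.
    with tempfile.TemporaryDirectory() as root:
        print(create_empty_files(root, fanout=3, files_per_dir=5))
```

Timing this per batch of files, rather than only at the end, would make it easier to spot the point where creation speed falls off.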

Those Arxiv and Phoronix links are great!

> https://www.phoronix.com/review/linux-611-filesystems/3

It's a shame that ZFS was not included.

Probably because it's benchmarking Linux filesystems.

I don't know what caused your experience, but I've had the opposite experience using XFS with many small files. It's my filesystem of choice for simple (single device) use cases.

The main reason is that XFS automatically allocates inode space. With ext4, I would quickly run out of the default inode quota while the volume was nowhere near full, and then manually tune the inode to block ratio to accommodate more files. XFS took care of that automatically. Performance was otherwise identical, and I've never seen any data loss bugs or even crashes from either one.

How many decades ago was that? Sounds more like a partition converted from ext3. Every ext4 partition I've seen in the last 15 years has had a ridiculous number of inodes. I do support for several hundred Linux systems.

Zero decades ago? I run EC2 instances that process hundreds of millions of small files. I always use the latest Ubuntu LTS.

I'm also tired of trying to share my experience and having to choose between ignoring snide ignorant low-brow dismissals, and leaving them unanswered so they can misinform people. Ext4 does not have a dynamic inode reservation adjustment feature. XFS does. So with ext4, it's possible to run out of inodes while there are blocks available. With XFS, it's not.
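
A quick way to tell whether a volume is closer to running out of inodes or out of blocks is to compare the two sets of counters from `statvfs`, which is roughly what `df` and `df -i` report. The following is only a minimal sketch: the helper names `fraction_used` and `fs_usage` are hypothetical, it is Unix-only, and it assumes that a filesystem reporting zero total inodes has no fixed inode limit to measure.

```python
import os

def fraction_used(total, free):
    """Return the used fraction, or None when no total is reported."""
    if total == 0:
        return None
    return (total - free) / total

def fs_usage(path):
    """Return (block_fraction_used, inode_fraction_used) for the
    filesystem that contains path."""
    st = os.statvfs(path)  # Unix-only call
    return (
        fraction_used(st.f_blocks, st.f_bfree),
        fraction_used(st.f_files, st.f_ffree),
    )

if __name__ == "__main__":
    blocks, inodes = fs_usage("/")
    print("blocks used:", blocks)
    print("inodes used:", inodes)
```

If the inode fraction is near 1.0 while the block fraction is not, file creation will fail with "no space left" even though `df` still shows free space.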

From this paper https://arxiv.org/html/2408.01805v1 (2024)

> EXT4 is the general-purpose filesystem for a majority of Linux distributions and gets installed as the default. However, it is unable to accommodate one billion files by default due to a small number of inodes available, typically set at 240 million files or folders, created during installation.

Which is interesting. I knew EXT2/3/4 had inode bitmaps, but I haven't been paying them much attention for the past decade. Slightly surprised they haven't added an option for dynamic allocation, OTOH inodes are small compared with storage and most people don't need billions of files.

That person is being extremely silly when they call that "small".

Ext2/3/4 reserves so many inodes by default. One per 16KB of drive space. You don't hit that with normal use. Almost everyone should be reducing their inode count so it doesn't take up 1.6% of the drive.

> That person is being extremely silly when they call that "small".

Explain.

> Ext2/3/4 reserves so many inodes by default. One per 16KB of drive space. You don't hit that with normal use. Almost everyone should be reducing their inode count so it doesn't take up 1.6% of the drive.

Well almost, but not the OP who runs out of inode space with the default format.

> Explain.

Hundreds of millions of inodes is not a small number. I'm not sure how I can explain that much better. There are multiple orders of magnitude between "240 million inodes" and "a small number of inodes".

And on a 14TB hard drive, the default would be more like 850 million.
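
The arithmetic behind those numbers, as a sketch. It assumes the defaults quoted in this thread (one 256-byte inode per 16KB of space) and a decimal 14TB drive. The ratio mke2fs actually picks can vary with filesystem size and the settings in mke2fs.conf, so treat these as ballpark figures; the function names are just for illustration.

```python
BYTES_PER_INODE = 16 * 1024  # assumed default: one inode per 16KB
INODE_SIZE = 256             # assumed default inode size in bytes

def default_inode_count(device_bytes, bytes_per_inode=BYTES_PER_INODE):
    """Approximate inode count for a device of the given size."""
    return device_bytes // bytes_per_inode

def inode_table_overhead(inode_size=INODE_SIZE, bytes_per_inode=BYTES_PER_INODE):
    """Fraction of the device taken up by the inode tables."""
    return inode_size / bytes_per_inode

if __name__ == "__main__":
    drive = 14 * 10**12  # 14TB in decimal bytes
    print(default_inode_count(drive))   # about 850 million
    print(inode_table_overhead())       # 0.015625, i.e. about 1.6%
    # Size at which this ratio yields 240 million inodes, in bytes:
    print(240_000_000 * BYTES_PER_INODE)
```

By the same ratio, 240 million inodes corresponds to a filesystem of a bit under 4TB.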

> Well almost, but not the OP who runs out of inode space with the default format.

I said "almost" for a reason. It's a bad idea for quite small drives or some rare use cases.

I ran a very small personal webserver with limited storage on Ubuntu on EC2 for a while.

The EC2 instance, likely the smallest configuration available at the time, hit an inode limit just running updates over time.

With Gentoo, if you allocate, let's say, 20G to / on ext4, then you can quite easily run into this issue.

/usr/src/linux will use about 30% of the space and 10% of the inodes.

/var/db/repos/gentoo will use about 4% of the space and 10% of the inodes.

Next you clone the firefox-source hg repo, which will use about 15% of the space and 80% of the inodes.

> Next you clone the firefox-source hg repo, which will use about 15% of the space and 80% of the inodes.

Looking at my mozilla checkout the source and repo average 6KB per file, which would eat lots of inodes.

But once I compile it, it's more like 20KB per file, which is just fine on default settings. So I'm not sure if the inodes are actually the limiting factor in this scenario?

And now that they're moving to git, the file count will be about 70% smaller for the same amount of data.

Based on the mke2fs git history, the default has been a 256 byte inode per 16KB of drive size since 2008, and a 128 byte inode per 8KB of drive size before that.

I had ext4 reporting out of space at 52% in df, lol.

I just converted it in place to btrfs.