Hacker News new | ask | show | jobs
by wahern 2041 days ago
Your point about the dentry cache seems like a non-sequitur. The cache isn't the bottleneck; the bottleneck is that NTFS is a crappy filesystem.

Yes, I've read this comment: https://github.com/microsoft/WSL/issues/873#issuecomment-425...

The argument about Windows not having a single, central dentry cache doesn't hold water. For one thing, there's no reason to think that a two-level cache would significantly decrease performance. More importantly, the fact that on Windows most VFS work occurs in the filesystem driver itself only proves that Windows could have implemented an EXT4 driver with better dentry and other semantics and permitted WSL1 environments to keep most of its files on an EXT4 volume, which is what happens with WSL2, anyhow.

WSL1 could have been better in every way. It would have required them to track a moving target, but they already accomplished semantic parity with seemingly minimal resources. Improving performance was certainly possible, and keeping up would have required significantly less work on its own. At the end of the day, I think the reason WSL1 was canned is obvious: the decision to ship WSL1 was probably accidental. Why accidental? Because anyone with the slightest business acumen would realize that a WSL1 environment with feature and performance parity to open source Linux would mean there'd be little reason to target Windows' native environment at all for new software development. And while Microsoft has done relatively well for itself with Azure and other ventures, its revenue still principally derives from Windows and Office, and the last thing Microsoft needs is to accelerate movement away from those products.

WSL2 means that Microsoft can keep its Linux environment a second-class citizen without being blamed for it, while still providing all the convenience necessary for doing Linux server application development locally. Integration with the native Windows environment will be handicapped by design and excused as an insurmountable consequence of a VM architecture. For example, with WSL1 and some obvious and straight-forward improvements it would have been trivial to run a Linux-built Electron app with identical performance, responsiveness, and behavior (not to mention look & feel, given the web-like UI). With WSL2 Microsoft will likely only ever provide GUI integration over RDP, which will never have the same responsiveness (not because it's theoretically impossible, but because nobody would ever demand it, and in any event it would require tremendously more effort than the comparable WSL1 approach).

3 comments

> the bottleneck is that NTFS is a crappy filesystem.

Care to explain why do you think NTFS is crap?

From my experience it's usually bad assumptions about files under windows and every application or library tries to stick to posix interface when dealing with files (open, read/write, close) which tends to block for longer periods on windows than on linux counterparts which results in significant perfomance loss .

Linux first software will always outperform windows implementation and Windows first software will outperform Linux implementations unless you provide separate code paths to properly handle underlying OS architecture and assumptions.

On Windows closing the file handle is extremely costly operation due AV checks and file content indexing [1]

[1] https://www.youtube.com/watch?v=qbKGw8MQ0i8

> Care to explain why do you think NTFS is crap?

I don't know if it's crap but it's much much slower than EXT4.

I remember reading a comment here that Windows in a VM on a Linux host was faster than bare metal.

Probably not true but I decided to make a test. I have a .net core app that insert data in a sqlite db (the resulting db is about 300 GB).

So I benchmarked this app on Linux (it was previously running on Windows) and IIRC it ran about 4 times faster.

An interesting observation I did was that even Hello World is about 100x faster on Linux than on Windows.

In my Linux VM I had to use the time command to even have an idea of how long it took, as it seemed to return immediately. I think 5-15ms but it already a while ago.

On the Windows machine where the Linux VM ran it took several hundred ms.

Which VM software did you use under windows?
Probably VirtualBox, could have been Hyper-V.
so what's the 'Windows way' then? keeping a file open forever and let it lock out other programs from using that file? I'm starting to get an idea why Windows is getting on my nerves so much with file locks....
Here is the thing, UNIX is the strange dude with advisory file locks, every other multiuser OS does the right thing to avoid file corruption.
> so what's the 'Windows way' then?

Using either memory mapped files of overlapped io (IOCP).

It's tricky to use when you want to write the content since you must preallocate the file before you start with the writing. Appending to file just doesn't work under NT kernel since WriteFile blocks even if you use overlapped io.

Devs just need different mentality when it comes to Windows programming compared to Linux. Due the fact that everything under NT kernel is operated asynchronously you'll have to adapt your code to such concept. Meanwhile under Linux you had no other alternative for nearly 30 years (io_uring and friends) so if you wanted to be portable with minimum OS specific code then you had to implement things in synchronous way or write two separate code paths for each OS.

Guess which one is used in practice.

You can just use FILE_SHARE_WRITE if you don't want to lock out other programs...
If I know one thing about files it's that any fix beginning with "You can just..." is probably wrong.

https://danluu.com/file-consistency/

I didn't recommend you do this, I just said it's there if you really want to.
This is not due to NTFS. (And NTFS is not crap either.)

Try it on another file system if you don't believe me.

NTFS on Windows was indeed always significantly slower when doing a lot of file operations. It was noticeable even long before WSL existed: NTFS and the whole infrastructure connected to it has by design bigger maintenance cost for every opening of the file, for example. That's undeniable. One can say that NTFS could have maybe been implemented faster on some other system (with less hooks etc), but NTFS on Windows is provably slower for the file operations people do every day on Linux, that's a simple fact. Like working with big source repositories containing a lot of files (git, node.js). The people who decided to make WSL2 faster for Linux programs achieved the speedup by going full VM. I would also prefer that they worked on NTFS on Windows long enough until the file operations become faster for everybody, but it didn't happen.

That's one of the lost opportunities that the original article laments about.

Edit: a reply to dataflow’s "try your operations on another file system" answer: I don't have to, I remember: there was a time when people used FAT formatted media much more than today -- and NTFS was slower than FAT too, under the same Windows, on the same hardware, without installing anything from third parties. It can be that on Windows 10 FAT got to be comparably slow, but in earlier times, on slower machines, FAT was visibly faster compared to NTFS. But NTFS was more robust to failures, and I preferred using NTFS for anything that is not temporary. I admit that the whole Windows infrastructure around what we consider purely "a filesystem" is also a serious part of the problem, by design. And as I wrote, it's a lost opportunity that the bottlenecks weren't identified and the changes made to the benefit of everything on some future Windows.

Again: try your operations on another file system, then tell me it's NTFS that's slow.

It's not NTFS that's slow at opening files. It's the I/O subsystem. You'll see the slowness with other file systems too.

This is the correct answer. Myth that NTFS is slow should already go away.

You can already run windows on top of BTRFS if you want but it'll be painfully slow compared to linux [1].

https://twitter.com/NTDEV_/status/1327358814891470850 https://github.com/maharmstone/quibble

Note: I've explicitly said: "NTFS on Windows" and "NTFS and the whole infrastructure connected to it". Also: "NTFS was slower than FAT" was true for years. I have no experience with BTRFS, but that doesn't prove anything without knowing more details (the overhead introduced to make it work). So... it's both NTFS and the Windows "subsystems."
> So... it's both NTFS and the Windows "subsystems."

Ok, probably both on default windows installation due legacy and backward compatibility with dos names [1].

https://docs.microsoft.com/en-us/windows-server/administrati...

I remember this issue when creating few million of files inside one folder and it was extremelly slow because of 8dot3 name creation. It has to go through each filename to generate short name O(n) when this legacy feature is enabled.

After disabling 8dot3 there were no performance issues anymore.

> I have no experience with BTRFS, but that doesn't prove anything without knowing more details (the overhead introduced to make it work)

I tried to point out following: Ext4, Btrfs, Zfs, any other UNIX filesystem will be slow under Windows. NTFS or any Windows first filesystem will be slow on Linux. There is just too much of the differences in OS architecture between the NT and Linux.

> The argument about Windows not having a single, central dentry cache doesn't hold water

This was more about cache coherency, and it's not an argument, it's a trait shared by every other Linux-over-X implementation (UML, coLinux). It is fundamental to solving a problem users want - seamless integration, i.e. no opaque ext4 blob. Why doubt the reasons given by Microsoft, when they match observations not just for WSL but every other system in the same class?