Hacker News
Pengwin – A Linux Distro Optimized for WSL Based on Debian (github.com)
104 points by edgr 2629 days ago
12 comments

This is fundamentally the wrong direction to go.

The clue is that MSWin running in a VM on Linux is faster than on hardware. The way forward is to boot Linux, or even something else, to manage hardware, memory, and filesystems, and cut down MSWin to run in a container on it. That way MSWin relies on the underlying OS to do things MSWin has proven to be just not very good at. MSWin runs programs written for it, reliably the same as non-hosted MSW, but is not subject to randomizing effects of manufacturers' drivers and MS's historically poor buffer and process management.

If you want MSWin to manage the screen, it can provide a way for an underlying OS program to work in a window it provides, and connect UI and clipboard events back to it. But MSWin is not really so great at display management, either, so it might be better for the underlying OS to manage that, too, as is done with VMs on X today.

> The clue is that MSWin running in a VM on Linux is faster than on hardware.

I'll join the fray in expressing skepticism here. Source?

Anyway, I think this WSL distro is pretty cool. I've set up WSL on my Windows computer for kicks and while it was neat I never quite got it to work for the Rust development environment I wanted. I ended up just doing it in Windows.

But if this sets up rust and VS Code (the editor I wanted to use) painlessly, then I'll have to give it another shot.

The filesystem is notably slower on Windows. It's easy to test it out for yourself; clone any big repo (eg: linux) in WSL and run `git status`. The Microsoft engineers have already put a lot of heroic effort into WSL. The issue seems to be more inherent to the design and really hard to solve.
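
For anyone without a big repo at hand, here is a rough sketch of the same kind of test in Python. It only approximates the metadata-heavy part of `git status` (an `lstat` on every file in a tree) and is not how git itself is implemented. The helper names `make_tree` and `stat_walk` and the tree sizes are made up for illustration.

```python
import os
import tempfile
import time


def make_tree(root, dirs=20, files_per_dir=50):
    """Create a throwaway tree of small files to measure against."""
    for d in range(dirs):
        sub = os.path.join(root, f"dir{d:03d}")
        os.makedirs(sub)
        for f in range(files_per_dir):
            with open(os.path.join(sub, f"file{f:03d}.txt"), "w") as fh:
                fh.write("x\n")


def stat_walk(root):
    """Walk `root` and lstat every file; return (file_count, elapsed_seconds)."""
    start = time.perf_counter()
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            os.lstat(os.path.join(dirpath, name))
            count += 1
    return count, time.perf_counter() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        make_tree(tmp)
        count, seconds = stat_walk(tmp)
        print(f"lstat on {count} files took {seconds:.4f}s")
```

Running the same script under WSL, native Windows Python, and a Linux VM on the same machine gives a like-for-like number, though caching will affect repeated runs.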

ref: https://github.com/Microsoft/WSL/issues/873

To be clear, this is for a specific use-case. I don't believe that all applications will have similar performance profiles and that Linux wins over Windows in all of the cases like OP seems to suggest.

It's definitely true in my experience that disk operations are slower in Windows. Folks on this thread are getting flak for poor comparisons (IMO Windows is worse at lots of small files), but even when you remove that, and say, look at mostly portable code that deals with a single file, as I have seen using sqlite as an example, it does worse.
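
A minimal sketch of a single-file test along those lines, using Python's bundled `sqlite3` module. The function name `time_sqlite_inserts`, the table layout, and the row count are assumptions for illustration, not the workload described above. Committing after every row is deliberate: with SQLite's default settings each commit has to reach the disk, so the loop mostly measures the OS write path rather than SQLite itself.

```python
import os
import sqlite3
import tempfile
import time


def time_sqlite_inserts(db_path, rows=200):
    """Insert `rows` rows, committing after each; return (row_count, elapsed_seconds)."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS t (id INTEGER PRIMARY KEY, payload TEXT)"
        )
        conn.commit()
        start = time.perf_counter()
        for i in range(rows):
            conn.execute("INSERT INTO t (payload) VALUES (?)", (f"row {i}",))
            conn.commit()  # one transaction per row, so every row is flushed separately
        elapsed = time.perf_counter() - start
        count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    finally:
        conn.close()
    return count, elapsed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        count, elapsed = time_sqlite_inserts(os.path.join(tmp, "bench.db"))
        print(f"{count} committed inserts took {elapsed:.3f}s")
```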

Not to mention Windows has built in antivirus reacting to all your filesystem activity by default...

It's entirely possible to disable this by adding an exception for the WSL directory in Windows Defender. In my experience, this vastly improves IO from within linux.
Good luck convincing your employer to do this. One feature request from me in case someone from MSFT is reading - please design things so that no file modify hooks are invoked at all for files under wsl directory. Files in wsl should be in a secure sandbox by default outside the reach of any antivirus.

I have heard that system devs at Microsoft have Admin mode enabled. I think that is why they don't realise how bad the dev experience is on windows when you don't have admin. You can't turn windows indexing off, you can't exclude things from anti virus, you can't launch Application Verifier. All dev workflows should work without Admin.

I wasn't talking about WSL at all actually. I have never used it. Disk performance is just worse on Windows. With or without defender. (And defender is the default.)
The WSL filesystem. That says nothing about the question.
Windows supports multiple file systems; Microsoft could solve this by adding ext4 support directly into Windows and running WSL from a separate partition.
No, because you still suffer from Windows buffer management, Windows I/O scheduling, and Windows kernel-thread scheduling.

It has been tried.

I don’t believe your assertion. A quick check on the latest system-taxing application (aka a game), confirms the exact opposite. Even a cross-platform game, compiled natively for both OSes, runs faster on Windows in most cases. How do you square these observations with your assertion?
I assume you are limiting your definition of "system-taxing" to playing games specifically?

As other people have noted, filesystem performance is often better under Linux.

I also will point out that at least some rendering tasks can be faster under Linux as well:

http://blog.thepixelary.com/post/167616662857/improving-perf...

Just like with the filesystem performance discrepancy, this article seems to point out WDDM (Windows drivers) as the reason for this rendering discrepancy.

And while we are on gaming (since I am a gamer as well!) when comparing games that run natively on Linux as well as Windows, Linux often has a performance edge over Windows:

https://www.phoronix.com/scan.php?page=news_item&px=Win10-Li...

Of course most games do not support Linux natively so in order to game on Linux you have to use WINE, Proton, or full virtualization to get a non-native game to run and this extra middleware layer adds overhead. But this doesn't mean gaming performance on Linux is worse than Windows; as comparing native-to-native performance shows.

Using git as a filesystem performance metric is horribly flawed -- git was written from the beginning to perform perfectly on Linux. It's the most native linux application I can think of.

(Not because it's using secret system calls, but because its design was vetted and performance tailored for Linux.)

I didn't really specify git myself, but it doesn't seem like NTFS is a particularly high performance filesystem these days:

https://www.phoronix.com/scan.php?page=news_item&px=Linux-4....

https://www.makeuseof.com/tag/linux-transfer-files-faster-wi...

https://www.tomshardware.com/reviews/ubuntu-oneiric-ocelot-b...

https://superuser.com/questions/1124472/why-is-linux-30x-fas...

I also want to point out that just because NTFS isn't the fastest filesystem on the block doesn't mean it is a bad one. From what I can gather just casually googling it seems like NTFS emphasizes safety at the cost of performance at times. So depending on your workload and desired behavior NTFS might be the file system of choice for you. It's also worth noting that IF NTFS is intended to be more user-safe it sort of makes sense why it may be the choice for the most common OS which aims to serve all users (not just technical ones).

I thought it was mainly because windows doesn't have a dir cache in the kernel. The WSL filesystem performance thread is linked above which goes into this.

And that can't be fixed without involving 10 different teams and potentially outside partners.

And there's something wrong with the filesystem, which can't be fixed without the same deal except that will definitely require the cooperation of the outside partners because apparently you can write an extension for NTFS and fixing it would break the existing extensions, which Microsoft doesn't have the source for and doesn't ship.

And there's a thousand other paper cuts, and a very large fraction of those would require cross-team coordination and testing, which is a horrible time sink in a large company.

Fundamentally, Windows has poor performance when dealing with lots of little files, which is exactly what compiling large code bases involves. And to top it all off, their build system conspires against their own OS. You'll get better performance using cmake + ninja (on Windows that is) than you will with cmake + visual studio.

Edit: also as a practical matter, if all my company's code is in git, and all outside code bases I work with are in git, I kinda would like decent git performance regardless of the origins of the tool.

I use windows as an outlook appliance at work, and as a gaming appliance at home. Otherwise I do everything in linux and am much happier.

It is also extremely slow on big files. Try opening a file, seeking out to 10GB, and writing 4KB. Instant, on any Unix. Better have something else planned for the afternoon, on Windows.
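
A minimal sketch of that test in Python, for anyone who wants to time it themselves. The helper name `seek_and_write` is made up for illustration. On filesystems that support sparse files the skipped range takes no disk space; where it has to be zero-filled, expect the call to take a long time and to use the full 10GB.

```python
import os
import tempfile
import time


def seek_and_write(path, offset, size=4096):
    """Create `path`, seek `offset` bytes in, write `size` bytes; return elapsed seconds."""
    start = time.perf_counter()
    with open(path, "wb") as fh:
        fh.seek(offset)          # jump past the end of the (empty) file
        fh.write(b"\0" * size)   # the file now has a hole before this block
    return time.perf_counter() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "big.bin")
        elapsed = seek_and_write(target, 10 * 1024**3)  # 10GB offset, 4KB write
        print(f"wrote 4KB at 10GB in {elapsed:.3f}s, "
              f"file size {os.path.getsize(target)} bytes")
```

Lower the offset first if free disk space is tight.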
Games really aren't a good benchmark for overall system performance as they are generally optimized for Windows and non-Windows ports often receive a lot less attention. Also, most games are primarily stressing one thing - the graphics stack - and not I/O, scheduling, memory management etc.
Which game? I don't agree with the position of the person to whom you are responding, but there's almost certainly much more optimization put into the Windows version.
Why is “much more optimization” a limitation? What is preventing devs from optimizing the linux version? If the OS is more performant that should require less effort, no?
My point is more that a game's performance is not necessarily a reliable way to measure relative performance of the OS. What game was it?
Windows absolutely dwarfs Linux when it comes to gaming market share. Game developers will factor in return on investment and therefore a proportional amount of time will be spent on optimizing the Linux port.
> MSWin running in a VM on Linux is faster than on hardware.

That's funny. I didn't know that. Any data source you can share with me?

It seems a highly counterintuitive thing to say but I have anecdotally noticed the same when running XP bare metal vs inside a VM running in Linux on the same hardware. Around that era I heard others notice the same on their PCs as well - and that’s where the origins of this meme come from. I didn’t spend any time investigating why Windows might perform differently, since running the VM was just to save dual booting rather than for any performance-related reasons. However, I think it was primarily around file system operations where the difference was noticeable (this was before SSDs were ubiquitous).

Microsoft have invested considerable effort on improving the NT subsystem in subsequent releases of Windows so I would be surprised if that meme was still true. Plus SSDs likely even the playing field too.

Apps certainly have been for a long time.

IE6 was famously (in my parts) known for being perceivably faster under Wine.

Wine is not a VM though.
I thought wine is not an emulator:)
> That way MSWin relies on the underlying OS to do things MSWin has proven to be just not very good at

Could you expand on what some of these things are?

File operations. Corralling vendors to code reliable, non-crashy drivers. Input device event handling. Thread scheduling. I/O scheduling. Anything where tuning would have involved work by more than one product group.
Even if your claim that running windows in a VM on hardware that has no Linux support, only windows support, is somehow feasible and faster... That kinda misses the point? FreeBSD and windows have Linux binary support. I don't see why having a project catering to one of those would be "the wrong approach"? There's been some pretty successful efforts in helping games run under wine, as a "reverse" example.
I think you'd want to qualify just which parts of which particular Windows applications running in a VM on Linux are faster. I see you refer to the filesystem driver, and that makes sense to me. However, if it's anything related to pro audio or 3D applications native to Windows, it would not square with my experience. Stability is also an important word to bring up here -- I haven't had a bad experience with WINE on Linux if it works. If it doesn't, I get odd behavior and crashing.
I can run three displays natively on my Win 10 machine pretty effectively on a small Dell laptop. Linux is pretty poor at multi-screen support from my experience although it might have improved.

As I have a corporate laptop running Win10 I can't install anything except use Cygwin or WSL. It doesn't allow me to run a Linux VM as it has networking issues upon installing Hyper-V or Virtualbox.

So I'm stuck with WSL on Win 10 and appreciate the efforts of others to get X apps running on it.

It works well enough for me. I have no great love for Windows, but then the only available alternatives have proven worse for me (both osx & linux each used f/t for several years).

There are no good desktop os's in 2019, so they're all 'wrong'. Being 'wrong' therefore cannot be a good argument against their use. All we can do is find the best compromise for our purposes.

The conceptual foundation to do something like this already exists in the form of Drawbridge and Hyper-V, there just needs to be a political will within Microsoft to make it happen.
I don't know if it's the wrong direction or not. I suppose it depends on what your goals are.

All I can say is that I don't see the need for any of it.

I always have difficulties with drivers on linux. Never on Windows. Until they fix that I cannot use anything other than windows as my base system. I don't care if linux handles things slightly better
Difficulties with drivers in terms of quality, availability or both? Regarding availability (and often quality) the only real/sustainable solution is for hardware manufacturers (who I think you are referring to by "they") to improve their Linux support.
There's nothing to fix. The devices are either open and a dev has decided to write his own, or the devices are closed and the drivers are provided by their respective creators.

If it's the former, pay a dev, if it's the latter, use a different device.

There's nothing to fix. The fire is either of a type that may be extinguished and a trained firefighter decided to put this type of fire out for everyone, or the fire is unable to be extinguished as was the intention of the entity that started it.

If it's the former, find the correct firefighter, if it's the latter, use a different fire.

There's nothing to fix. The construction material is either of a type that does not spontaneously combust, and a carpenter can make you a fine chair out of it, or the material is the type to spontaneously combust, and if you choose to buy a chair made of that material you will soon find your pants are on fire.

If it's the former, pay a contractor to build you a chair, or make one yourself. If it's the latter, use a different construction material.

There is nothing to cure. The pathogen is either of a type that is curable, and a licensed medical professional can prepare a treatment regimen, or the pathogen will liquefy all of your internal organs within minutes, and if you choose to expose yourself to the latter pathogen, you will find yourself having a bad day very quickly.

If it's the former, pay a licensed medical professional or read online forums and do it yourself. If it's the latter, contact your nearest funeral home.

Or! Hear me out here: I use Windows.

Lock-in sucks, but it's real.

Yes, but you can run it in a VM. Microsoft could make it run in a container, instead, if they cared about your time. They might do it to make their Azure servers faster, but have little incentive to tell you.
I'm a paying customer that's been running my main dev system on WSL and Pengwin for four months now. I've got it on my ThinkPad and my beefy desktop and sync the two environments using .dotfiles. There is a bit of an I/O slowdown compared to native (most noticeable when npm installing a billion tiny modules), but generally everything is quite functional. No driver issues as when I've run Linux. No terrible Mac keyboard. I run VSCode under WSL to keep my dev toolchain on the Linux side. The main thing that forces me to dual boot to Ubuntu is CUDA development, but I find that most of this sort of work has shifted to Jupyter Notebook.
What's the main benefit of this distro over the standard Ubuntu usually offered for WSL?
Can someone please enumerate how this is different from Ubuntu 18.04 (from Canonical) on WSL? Other than distro differences, I see a LOT of marketing speak in the Github ReadMe, but I can't tell if they really offer any advantage over competing offerings.
On a technical level Ubuntu 18.04 is based on a mix of Debian stable and testing with Canonical's additions.

Pengwin is primarily Debian testing, with some stable and some unstable here and there.

Pengwin configures dozens of settings for WSL and has optional WSL-specific features.

Settings are delivered by pengwin-base and features can be configured with pengwin-setup.

You could probably spend hours implementing these features on your own each time you have to install Ubuntu on a new Windows device.

But by purchasing Pengwin you support open source indie devs that handle it for you, answer bug reports, constantly add new features, and are available for support.

What are these settings you keep referring to?

The readme is very vague about what Pengwin's features and differences actually are, and gives me no solid reasons to switch from Ubuntu, which already seems well-suited to WSL. Am I correct that trying Pengwin requires paying and installing through the Microsoft Store, even though it's open source?

Interesting:

“Grants/Bounties

If you have an idea for a new feature you would like to see implemented in Pengwin and can implement it yourself given the funding, we are now accepting grant/bounty proposals. Grants are currently available for $50-$500 USD based on complexity.”

The upper limit is quite demotivating, from my point of view.

Demotivating to whom? I imagine the intent is to avoid creating an impression that anyone has the power to make their pet issue priority #1 for the dev team just by throwing enough money at it.
But it's not for a "dev team" at all, the way I understood it, it's:

"Learn how you can earn paid grants improving Pengwin."

"Your proposal will be promptly evaluated" ... "Your work sample and GitHub history will be evaluated to determine if you have the technical competancy to implement the proposed change and deliver timely." "If approved, you will recieve a simple agreement covering the scope of the work you will be doing which you must acknowledge and return and then you may commence work."

You are correct. This is for contributors to get paid to add features they want to see in the project. I wish we had more applications under the grant program.
Is WSL multithreaded? I love the ease of use that WSL gives, but it seems slow compared to cygwin on disk access. I think this is planned in a future fix by making the WSL directory live on the disk instead of in a container file. I went back to cygwin for faster speed of scripts processing files.
I noticed file IO was slow too. My solution is to do most of my work in headless Linux VMs, using WSL as the interface to manage and connect to those VMs. This way I get the bash shell and ssh client I love but get isolated environments for my projects/clients. I run samba on the Linux VMs, with mapped drives on the Windows host so that I can edit files using VS Code. Outwardly it might look a little convoluted but I get isolation, performance, convenience and an ultra smooth host OS experience.

I say this coming from around a decade of running Linux or MacOS on the desktop. This just works.

What do you use as your VM software? I take it VMWare has pretty decent IO performance to the host's files through HGFS, while VirtualBox doesn't perform as well.
I'm actually using Virtualbox. In my case, the guest doesn't see the host's files - it's the other way around. 80% of the file operations happen within the VM, with the rest being in VS Code on the host (via a mapped drive pointing to a samba share on the guest). This lets me preserve file permissions and allows me to back up or migrate my entire project by just copying the VM disk images. It also gives me all the performance I need (and with a ton of ram, my database workload rarely hits the disk, which is a fast nvme ssd anyway).

I'm not running a Windows Insider build, and the current version of WSL doesn't accurately represent the posix permissions on files seen via the /mnt/c mount (DrvFs). The next version resolves this by storing posix permissions and other metadata. In my case, being unable to manipulate the permissions causes us some problems. It makes a lot of sense to work on the files within the VM, in a native ext4 volume.
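
A quick way to check whether a given mount actually keeps posix permissions is to chmod a scratch file and read the mode back. This is a minimal sketch; the helper name `chmod_sticks` and the 0o640 test mode are assumptions for illustration.

```python
import os
import stat
import tempfile


def chmod_sticks(directory, mode=0o640):
    """Create a scratch file in `directory`, chmod it, and report (preserved, actual_mode)."""
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        os.chmod(path, mode)
        actual = stat.S_IMODE(os.stat(path).st_mode)  # permission bits only
        return actual == mode, actual
    finally:
        os.remove(path)


if __name__ == "__main__":
    # Point this at the mount you care about, e.g. a directory under /mnt/c.
    ok, actual = chmod_sticks(tempfile.gettempdir())
    print(f"mode preserved: {ok} (got {oct(actual)})")
```

On a native ext4 volume this should report the mode as preserved; a mount that does not store the permission bits will hand back a different mode.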

It's only slow at file IO, yeah.
File IO can be vastly improved by excluding the WSL OS directory from Windows Defender real-time protection. Here's a gist that has powershell and GUI instructions for accomplishing this: https://gist.github.com/noelbundick/9c804a710eb76e1d6a234b14...
This is possible but not recommended unless you are very cautious.

Windows Defender protects from Windows malware entering via WSL and Linux malware as well.

For example Windows Defender has caught compromised npm modules inside WSL.

That is really interesting to hear! I am quite cautious about what I do in any environment, of course. How would you say WSL without windows defender compares (security-wise) with say, running equivalent operations with homebrew or macports in OS X? Or just in vanilla bare-metal linux?
There aren’t that many malware that are worse than antivirus.

What I don’t understand is why antivirus has to second guess the user.

I tried looking around but didn't see anything about it.

What makes this optimized for WSL?

We alter dozens of settings to defaults that make sense for the WSL environment. Unlike the other distributions available for WSL, Pengwin is designed for WSL first.
Any examples of the optimizations?
Scroll down to Features on the linked page..
> Pengwin is also the first Linux distribution pre-configured and optimized to run specifically on Windows® Subsystem for Linux, a Microsoft-supported feature of Windows 10 and Windows Server 2019.

Yes. What kinds of optimizations make it optimized? This is my original question. Telling me the equivalent of "read the article" changes nothing when I've already done so and that it doesn't say what's been changed to make it optimized.

I've yet to carefully go through all 900 commits, with subject lines almost entirely of pull request #s, but one would think if the whole repo's title was how it was "optimized" they could substantiate what they mean or what they did.

I'm not sure if/what they changed in the distribution itself, but tooling optimizations are in the list:

* Pengwin includes wslu, a set of useful open-source utilities for interacting between WSL and Windows 10

* Manage your Microsoft Windows and Azure deployments with PowerShell and azure-cli, command line tools for Azure.

* Enable/disable Windows Explorer shell integration.

* Configure experimental GUI settings, including a Windows 10 theme for your Linux applications, HiDPI support and international input methods.

* Create a secure bridge to Docker running on Windows.

* Support for many Linux graphical applications with no need to configure display or libGL in Pengwin. (Requires a Windows-based X server, such as X410.)

* Pengwin provides faster patching for WSL-specific bugs than any upstream Linux distro available on WSL.

Started using this yesterday, with the X410 X11 install.

The nicest thing has to be running IntelliJ from X11 and have it see the linux filesystem.

I can run a Yubikey with weasel-pageant and get GPG signing and SSH keychain access to github through it -- it pops up a Windows PIN entry dialog box and then works fine thereafter. You can use wslutilities (wslusc) to set up windows shortcuts to your Linux applications. I use ConEmu as my shell and it's fine.

There is a problem with HiDPI screens. I've worked around this by increasing the font size in IntelliJ, but VS Code shows up as teensy tiny.

The worst part has to be the documentation. The docs for "pengwin-setup" are basically a bunch of screenshots on their blog.

the name is very clever.
My thought too. No matter what the implementation or ideas behind it... that's a killer name.
Hahaha such a great name yeah!
The name and logo were developed in consultation with Dennis Bednarz. https://twitter.com/DennisBednarz
Is this actually a Linux distro, though? I've long held that "WSL" is a misnomer; it happens to be compatible with binaries compiled for Linux, but given that no Linux code is actually present, this would really be Debian GNU/NT.
What? WSL emulates linux syscalls and vfs, how is this a misnomer?
It's a misnomer because it's not actually Linux (nor does it even wrap/include Linux in any way), just like how Wine isn't actually Windows (if it was Linux, then Microsoft is blatantly violating the GPL). Most charitably, it's less "Windows Subsystem for Linux" and more "Windows Subsystem for Programs Compiled for Linux".
I'm so confused right now. I looked at the various links and can't believe I still can't figure this out. The kernel is windows or linux? How do I get or run this? I install windows and then this? Someone help me out here!
WSL is a Windows 10 feature that allows you to run Linux applications on the NT kernel. It's not a 100% complete implementation of every Linux kernel API (yet?) but it's far enough along that you can run a lot of interesting software.

https://www.hanselman.com/blog/TheYearOfLinuxOnTheWindowsDes...

Pengwin looks like a Linux distro that removes the parts of a traditional distro that don't make sense in WSL and streamlines things that are still somewhat difficult in WSL, like getting X applications running.

I get it that WSL is easy for anyone to use, but why not just run a real Linux in VMware or VirtualBox and have a 100% compatible Linux environment?

Prebuilt OS image is just a single download and deploy.

Because I don't need a 100% compatible environment. WSL has better host integration and better performance in my experience, and it runs my development stack fine.
Was going to say much the same thing, plus it starts instantaneously, vs a minimum of several seconds for a VM
WSL is Windows Subsystem for Linux, and it is part of Windows. https://docs.microsoft.com/en-us/windows/wsl/install-win10 It translates Linux system calls into Windows.

This allows you to run a Linux userspace on Windows. The Windows store has (for free) installable distros for Ubuntu, Debian, and others.

Essentially WSL is to Linux what WINE has been for years to Windows: it translates Linux system calls natively to Windows ones so that Linux binaries can be executed on Windows without any recompiling or virtualization. Technically it is a great milestone, but there are some risks: suppose one day Microsoft implements a way to use Windows GUI elements or other resources directly from Linux binaries and people start writing "Linux" software that uses the new capabilities, de facto requiring WSL (ie Windows) to be run, how many Linux users would jump ship? My only fear is that Microsoft could be planning to do just that.
That seems improbable, although the reverse is happening. When Flatpak support is complete, it'll be possible to have desktop application packages that just work on both Windows and Linux.
I guess the “extend” phase is about to begin.
Oh boy, similar name to me haha! ;)