by wicket 1870 days ago
With Linux having caught up with key Solaris features in recent years (DTrace -> eBPF, Zones -> Namespaces, ZFS -> ZFS on Linux), I always thought that the main reason to use Illumos would be first-class SPARC support. With that now dropped, I'm concerned that Illumos will soon become irrelevant. Are there any compelling reasons left to use Illumos, other than being something for those who just want a free Solaris alternative?
5 comments

In my opinion Linux hasn't caught up.

* Namespaces don't come close to FreeBSD jails or Solaris / Illumos Zones. There is a reason Docker hosters put their Docker tenants in different hardware VMs: the isolation is too weak.

* Due to CDDL and GPL problems, ZFS on Linux will always be hard to use, making every update cycle like playing Russian roulette.

And there are other benefits. Like SMF offers nice service management while not providing half an operating system like systemd.

The problem with this jails/zones stuff is that I don't know anyone who seriously trusts jails and zones for real multitenant workloads anyways. The dealbreaker problem remains a shared kernel attack surface between tenants. It's one thing to propose that Zones are better than namespaces (they probably are), but another thing to cross the threshold where the distinction is meaningful in practice.
At Joyent, we deployed public-facing multitenant workloads based on zones (and before that, jails) for many years. We seriously trusted it -- and had serious customers who seriously depended on it. So, now you know someone!
To be fair, y'all had some serious vulnerabilities, including zone escapes and arbitrary kernel memory reads, discovered by @benmmurphy.
Yes, though I would like to believe that Ben's responsible disclosure coupled with our addressing those vulns (and auditing ourselves for similar) reflect exactly that seriousness around multitenant security. And for whatever it's worth, one of those vulnerabilities -- which was a bug in my code! -- very much informed my own thinking about the inherent unsafety of C, underscoring the appeal of Rust. So I am grateful in several dimensions!
If you have a kernel implemented in Rust, (1) you should shout that from the rooftops and (2) use whatever isolation mechanism you like on it.
To this, all I can say is that I spent from 2005-2014, and then from 2016-2020, doing nothing but security evaluations of products, probably about 60% of which were serverside multitenant SAAS systems of one form or another, and I don't remember ever evaluating (or overseeing the evaluation of) a system that relied on Jails or Zones. Lots of Docker! And, until a few years ago, multitenant Docker isolation was an infamous joke! I'm not sticking up for it!

You can look at the recent history of Linux kernel LPEs --- there has been sort of a renaissance because of mobile devices --- and count all the ways any shared-kernel multitenant system would have broken down. At the end of the day, it's not so much about predicting whether your system can get owned up (it can), so much as: "what do I need to do when there is a kernel LPE announced on my platform". If you're doing shared-kernel isolation, the right answer to that question is usually "fire drill". It's not a noodley thought-leadership kind of question; it's a simple, practical concern.

There were also tons of providers who trusted Linux containers for VPS hosting.
How'd that turn out?
I haven't heard any stories of people being hacked via container escape, but the whole VPS industry was so low-stakes that maybe customers didn't expect good isolation anyway.
And needless to say it became a billion dollar business, with a great product.
They were acquired for $170m.
I stand corrected. Still great product, business and team.
Security requirements (and awareness) have increased over the years, have they not?
They definitely have! And we had a (zones-based) public cloud through it all. On that note, Alex Wilson's description of working with Robert Mustacchi on mitigating Meltdown by adding KPTI to illumos[0] definitely merits a read!

[0] https://blog.cooperi.net/a-long-two-months

Also, tools for improving Docker for multi-tenant workloads exist, like gVisor. I don’t think equivalents really exist for jails/zones.
gVisor isn't a shared-kernel multitenant system; it's essentially kernel emulation. It's a much stronger design.
I mostly mean that it is intended to be a solution to run containers from multiple tenants on the same host. Though I do agree, being essentially a kernel in itself, it is a bit in a different wheelhouse. It still is a huge value add that you can implement something like that on top of Docker, imo.
You can run container workloads in "real" VMs too; for instance, check out Kata Containers. Containers are a way of packaging applications; confusingly, they happen to also have a reference standard runtime associated with them. But you don't have to use it.
Similar to Xen?
Much weirder than Xen.
> The dealbreaker problem remains a shared kernel attack surface between tenants.

Also, now, extremely subtle and hard-to-mitigate timing attacks between tenants.

In fairness, that's an attack class that's very difficult to eradicate even with virtualization.
> In my opinion Linux hasn't caught up.

I completely agree. I love Linux and it’s easily my preferred desktop OS, but when it comes to stuff like ZFS, containerisation and other enterprise features, FreeBSD and Solaris are just more unified and consistent. A lot of it has to do with Linux being a hodgepodge of different contributors, resulting in every feature effectively being a 3rd party feature. Which I think is the problem Poettering was trying to solve. And in many ways that’s quite a strength too. But ultimately it boils down to the old Perl mantra of “There's more than one way to do it” and how it’s fun for hackers, whereas FreeBSD et al add the “but sometimes consistency is not a bad thing either” part of the mantra too.

https://en.m.wikipedia.org/wiki/There%27s_more_than_one_way_...

> Namespaces don't come close to FreeBSD jails or Solaris / Illumos Zones. There is a reason Docker hosters put their Docker tenants in different hardware VMs: the isolation is too weak.

This is largely a myth; please provide a namespace-related CVE that has gone unpatched to support your argument. The reason they run as VMs is that hypervisors run on ring 0 and require higher privileges than the kernel, therefore they are naturally more secure. Like Namespaces, Zones and Jails are also managed by their respective kernels. If there were any major hosters running managed services for Zones and Jails, you can bet they would implement them in a similar way.

> Due to CDDL and GPL problems ZFS on Linux will always be hard to use making every update cycle like playing Russian roulette.

You're right that the CDDL causes complications, but I don't consider this to be a compelling reason to use Illumos. Many who want to use ZFS on Linux will use it and get it to work despite the licensing issues and complications.

> Like SMF offers nice service management while not providing half an operating system like systemd.

SMF is relatively nice (apart from the use of XML) and, like you, I would not touch systemd with a barge pole. Despite systemd making a lot of noise in major distros, there are plenty of alternative distros for those of us who don't want to use it.

Don't get me wrong, I'm a Solaris guy, it made my career. I just fear that by dropping SPARC, Illumos have put the final nail in their own coffin.

> I just fear that by dropping SPARC, Illumos have put the final nail in their own coffin.

If that were true, most illumos users today would be SPARC users. There would be more than a couple of people working on SPARC support, and not merely as a part-time hobby. There would be software support for a SPARC machine that was sold some time after 2011.

Instead, something like 99% of the people running illumos are doing so on 64-bit x86 machines. Dropping SPARC support will allow us to move forward much more easily with enhancements to the dramatically more relevant x86 bits. If anything I expect it will allow us to do interesting things that would garner new interest, like using Rust to implement bits of the operating system.

Thanks for the reply. That surprises me but I'm glad to hear that x86 support is strong. It does lead me to wonder, what sort of things are people using Illumos for? Maybe it's time I checked it out. :)
> This is largely a myth; please provide a namespace-related CVE that has gone unpatched to support your argument.

What I mean is that if you use LXC namespaces as a container, it is going to be an insecure container, simply because LXC namespaces are not containers and are not going to provide a fully isolated environment. Namespaces are low-level building blocks which, together with other technologies (for example a virtualized network stack), you can use to make fully isolated containers. And that's why most hosters just took a shortcut and put the whole thing in a hardware VM to ensure tenants are fully isolated. Which I think is a shame, since you also get all the overhead of a hardware VM.

So sure, you can _make_ something like jails or zones on Linux if you combine a bunch of things and provide the glue. But there is no concept of a container like jails or zones in Linux. Which leads to other problems, such as there not being any tooling to manage the (non-existent) container.
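
To illustrate the "building blocks" point, here is a minimal Python sketch that lists the separate namespaces a process belongs to. It assumes a Linux host with /proc mounted; the function name `current_namespaces` is made up for this example, and on any other system it simply returns an empty result. Each entry is an independent kernel object, and nothing in the output names a single "container" that ties them together.

```python
import os


def current_namespaces(pid="self"):
    """Return {namespace_type: identifier} for a process.

    Assumes Linux with /proc mounted; returns {} when the
    directory is missing (other OS, no /proc, unknown pid).
    """
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    try:
        entries = sorted(os.listdir(ns_dir))
    except OSError:
        return result
    for name in entries:
        try:
            # Each entry is a symlink whose target identifies the namespace.
            result[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue  # entry vanished or is not readable; skip it
    return result


if __name__ == "__main__":
    for kind, ident in current_namespaces().items():
        print(f"{kind:20} {ident}")
```

Two processes share a given namespace only if the identifier for that type matches, so "being in the same container" has to be defined by whatever runtime assembled the pieces rather than by the kernel.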

> The reason they run as VMs is that hypervisors run on ring 0 and require higher privileges than the kernel, therefore they are naturally more secure.

I don't know if I fully understand what you're saying here, but I think you mean that with a type 2 hypervisor the hypervisor's kernel runs in a more privileged mode on the CPU than the virtualized kernels it manages, and that this provides additional security?

I don't really see how a type 2 hypervisor would conceptually give additional security compared to a type 1 hypervisor (where a single kernel can provide multiple OS instances, such as FreeBSD with Jails or Solaris / Illumos with Zones). Everything that is not the "main" kernel always executes in a less privileged mode than the kernel executing it on the CPU. For example, no user process executes in Ring 0 on a "normal" (i.e. non-hypervisor) OS. With containers this is no different. Hardware virtualization doesn't give a big conceptual advantage in that regard as far as I know.

Solaris SPARC is one of the few OSes in production taming C with hardware memory tagging (ADI).

With Linux that will eventually happen on ARM, but currently only Android is adopting it; who knows when it will come upstream, enabled by default like on Android.

>ZFS on Linux

It's been said in the thread already, but this was always a non-starter. Torvalds even said so himself. CDDL was the last poison pill of a dying giant who couldn't pull its foot from the well.

What we, er, the linux community, chose instead, was BTRFS. It isn't ZFS, but it's made incredible strides. For most use cases, it is a reasonable and working replacement for ZFS.

> for most use cases, it is a reasonable and working replacement for ZFS.

That is a huuuuge overstatement for the current state of Btrfs. In some specific domains it is a working replacement. But for most domains it still falls far behind ZFS in terms of stability, resiliency or even ease of use.

By all means, if you want to use btrfs then go for it. But the favourable comparisons people make when comparing btrfs to ZFS are a combination of wishful thinking and not having really bullied their fs into those extreme edge cases where the cracks begin to show. And frankly I’d rather depend on something that has had those assurances for 10+yrs already than have the hassle of explaining downtime to restore data on production systems.

> What we, er, the linux community, chose instead, was BTRFS. It isn't ZFS

Speak for yourself. As a part of “the Linux community” I gave btrfs a fair chance, but stopped using it because it constantly failed on me in ways no other fs had done before and didn’t protect my data.

ZFS is rock solid and I’ve never had any of the issues I had with btrfs.

So as a member of “the Linux community” you claim to unilaterally represent, I put such petty license-politics aside and choose the file system which serves my needs best, and that is ZFS.

BTRFS is the only fs I recall having lost data to (rather it corrupts data, i.e. data of one file is found in another) as late as 2018.
I lost data and had to restore from 12 hour old backups one too many times with BTRFS. XFS + Ext4 for me from here on out, but that's one of the great things about Linux: lots of choices.
Lamentably, for those of us that just use linux, the lots of choices seem weird and frankly a little scary (Stories of data loss with BTRFS I have heard before.) I use the default Ext4? I think?

But as a believer in open source, I really would love it if the choices we had were so much superior to the proprietary stuff that was out there that it made using an open source OS a no brainer.

XFS all the way down...or ZFS (on FreeBSD)
What is the latest on RAID-5/6 support for BTRFS? RAID-Z has its issues (saying this as the triple- and double-parity author), but it's been stable.
As far as I've heard, it's still pretty iffy. Since you were involved with the zfs side of things, what are your thoughts on the upcoming zfs draid bits? I don't have specific need for them myself, but they look really attractive for building a new pool to replace my aging drives.
Unfortunately I've been completely out of the loop on the draid stuff.
>What we, er, the linux community, chose instead, was BTRFS.

Isn't that putting politics before technical excellence, something the Linux crowd is proud of? Other than in-place volume expansion, there is no technical reason to choose BTRFS over ZFS (for now).

I don't really see a killer feature from BTRFS that would persuade me to take a chance with it.

It's being pragmatic. Linux has typically placed freedom ahead of pretty much everything else.[1] All else being equal, sure you want the best technical solution. But if it doesn't fit the definition of freedom that Linux requires, how otherwise good a solution is doesn't matter. So the main 'killer' feature of BTRFS is that it fits the licensing requirements for integration into Linux. Linux has a great many problems, but being sticklers for a particular type of license isn't one of them IMO.

[1] This isn't just idealism. See Oracle v. Google for an example of what happens if you play fast and loose with licenses and a malicious actor. Google eventually won, but how many millions of dollars did that victory cost them? Oracle would love Linux developers to blunder their way into the receiving end of a lawsuit.

>Isn't that putting politics before technical excellence, something the Linux crowd is proud of?

It's not unprecedented. The adoption of systemd was forced on distros through political pressure, and not for technical reasons.

If you want a truly non-political OS community these days, I think you're basically stuck with OpenBSD. No CoC, no systemd, no political BS at all -just pure tech.

(there's other problems with OpenBSD -performance, mostly; that's why I use windows and Ubuntu instead. But the way they run things is admirable IMO. Blatant BS isn't tolerated.)

systemd wasn't "forced" on distros. The distros adopted it because they liked it.

The thing is that systemd did something quite clever -- it sold itself to the people actually building distributions, who are the people that actually matter the most with regard to what system software gets used. It made their jobs easier and less annoying in many ways.

As somebody who's done a lot of packaging and writing of SysV scripts, I can tell you that it's a tiresome and annoying task even for a small amount of software, let alone a whole distro. At that point the unix philosophy loses its luster quite a bit.

> It's not unprecedented. The adoption of systemd was forced on distros through political pressure, and not for technical reasons.

Sorry, I'm going to slag on this.

Anyone could have put in the work to make a better init experience. No one put in that work.

Service Management Facility (SMF) has existed on Solaris since 2005 (systemd didn't appear until 2011?). launchd on OS X dates to a similar time. Someone could have copied them--no one did.

Even once it became clear that systemd was going through, still nobody could muster the work to put together a viable alternative.

Where's the "meritocracy through code" Linux mantra in all of this?

You can say what you want about Poettering, but he put in the work to write the code. Nobody else did.

Perhaps the problem is that an init system is a metric boatload of finicky code that nobody had the guts or skills to drive to completion?

And for the old-init bigots, sorry, that wasn't working in spite of what you claim. The fact that Windows, OS X, Solaris, etc. (and then systemd) all converged on essentially the same design is because of common needs on modern computers.

There was upstart, but it didn't work very well, mostly because it pre-dated the kernel mechanisms like cgroups that enabled a reliable service manager to be implemented. So it was only used in its sysvinit-compatible mode, not its native upstart mode.

https://bugs.launchpad.net/upstart/+bug/406397/comments/21 https://bugs.launchpad.net/upstart/+bug/447654/comments/6

The Linux Community is working on BTRFS but if ZFS emerged with a GPLv2 compatible license tomorrow BTRFS would likely be moribund.
Yup. If openzfs was gpl-ed tomorrow morning, btrfs would be dead by tomorrow at lunch time.
When filesystem integrity matters, the filesystem matters more than the OS.

While I mostly use Linux these days, for file servers it must be ZFS, which means whichever OS has first-class support for ZFS. I'm still on Illumos but perhaps will move to FreeBSD at some point.

This is why FreeBSD rebasing its ZFS fork on ZFS-on-Linux made me so scared for the future of FreeBSD. Their one major advantage over Linux and they didn't have the developers to maintain their fork themselves.
ZFS will always be a smoother experience on FreeBSD as opposed to Linux because FreeBSD endorses it. Thus the user land and documentation is written assuming you’re running ZFS. As opposed to Linux where some distros might ship pre-compiled binaries but everything is written assuming you’re not running ZFS. Thus everything takes that extra couple of steps to set up, fix, and maintain.

For example, if you want to use ZFS as a storage for containers on Linux, you have to spend hours hunting around for some poorly maintained 3rd party shell scripts or build some tooling yourself. Whereas on FreeBSD all the tooling around Jails is built with ZFS in mind.

This is why platforms like FreeBSD feel more harmonious than Linux. Not because Linux can’t do the job but because there are so many different contributors with their own unique preferences that Linux is essentially loose Lego pieces with no instructions. Whereas FreeBSD has the same org who manage the kernel, user land and who also push ZFS.

And I say this as someone who loves Linux. There’s room for both Linux and FreeBSD in this world :)

I think "smoother experience on FreeBSD" is a myth -

The standard volume manager on FreeBSD is vinum/geom; ZFS ships its entire separate volume manager to the host OS, so you can't use mount/umount to control mounting a ZFS volume. Maybe it would be okay to move entirely over to ZFS's volume manager but it only supports ZFS's own filesystem, you can't use the ZFS volume manager with a normal FreeBSD UFS2 partition.

In both Linux and FreeBSD, ZFS's bolt-on ARC competes with the kernel's actual page cache for resources instead of properly integrating with it.

It's an out-of-tree filesystem for both OSes. Sure FreeBSD periodically imports it into master from OpenZFS (née ZoL), but all development happens elsewhere, and the SPL is still trying to emulate a Solaris interface on top of both OSes.

Is there any more concrete example of how ZFS is actually better integrated on FreeBSD compared to Linux, say Ubuntu? It takes ZFS snapshots automatically during apt upgrades, root-on-ZFS is a default installer option, etc.

Coincidentally there was a discussion about this yesterday. I agree with a lot of what was posted in it so might be easier to share that: https://news.ycombinator.com/item?id=27059551

This branch in particular addresses your points: https://news.ycombinator.com/item?id=27062069

Has the latest version of Ubuntu finally made mirrored ZFS root pools painless? Because that was anything but a native out of the box experience (compared to setting up the same on FreeBSD) and that has bit me several times.

I've used ZFS on both FreeBSD and Linux for years and, while Ubuntu is closing the gap, ZFS has been the default recommended file system on FreeBSD for close on 10 years already. So it's bound to feel more like a native experience on FreeBSD.

> In a review in DistroWatch, Jesse Smith detailed a number of problems found in testing this release, including boot issues, the decision to have Ubuntu Software only offer Snaps, which are few in number, slow, use a lot of memory and do not integrate well. He also criticized the ZFS file system for not working right and the lack of Flatpak support. He concluded, "these issues, along with the slow boot times and spotty wireless network access, gave me a very poor impression of Ubuntu 20.04. This was especially disappointing since just six months ago I had a positive experience with Xubuntu 19.10, which was also running on ZFS. My experience this week was frustrating - slow, buggy, and multiple components felt incomplete. This is, in my subjective opinion, a poor showing and a surprisingly unpolished one considering Canonical plans to support this release for the next five years."

This was Ubuntu's latest LTS release, which is less than a year old. Granted, not all of the criticism levelled against it is ZFS related, and granted, that's just another person's anecdotal report, but it mirrors the experiences everyone else, aside from yourself it seems, raises when switching between Linux and FreeBSD for ZFS storage.

I don't post this as a hater though. I, like others, do still run Ubuntu Server + ZFS for some systems (particularly where I wanted ZFS + Docker) and those systems do run well. But I can't deny that everything requires just a little more effort to get right on Linux because there isn't the assumption you're running ZFS, whereas FreeBSD is more or less pre-configured to use ZFS right out of the box because that's the expectation. E.g. FreeBSD container tooling is already written to support ZFS, whereas Linux container tooling isn't.

This is why people talk about a smoother experience on FreeBSD. The file system itself is the same code base and performs largely the same. But it's all the stuff around the edge that is built with the assumption of ZFS on FreeBSD that makes things feel a little less hacked together with duct tape.

My main issue with ZFS is the integrated nature - like systemd for filesystems. My 'alternative' for ZFS isn't BTRFS (awful performance characteristics for my workloads) but LVM coupled with ext4 and mdraid. I get snapshots, reliability, performance and a 'real UNIX' composable toolchain. I miss out on data checksums.
In principle I dislike the coupling of volume manager, raid and filesystem.

But I still think zfs gets most things right; I see the argument for a consistent system managing caching/logs, volumes, data integrity, discard support, compression, snapshots and encryption.

The fact that it's the first serious, open, cross platform solution (Linux, bsd, Mac, winnt) that provides encryption, integrity and filesystem is a nice bonus.

And the integration of snapshots and fs dumps via zfs send/receive is beautiful.

I think zfs makes sense as one fat layer - networking can go below (drbd, iscsi) or on top (iscsi, nfs, cifs).

Encryption needs to be somewhat holistic - for making sane performance and data leaking tradeoffs.

Having run all these things in prod (except BTRFS, it ate a mirror on my desktop), I’ll say that even the LVM + + is so much more hacky than geom on FreeBSD, which feels much more ‘Unix’ with a designed composable interface.

Although, I do prefer the durability of XFS or ext4 (depending on workload) vs UFS, and the setup you described is totally maintainable.

No compression..
>Torvalds even said so himself.

Torvalds's comment about ZFS was as uninformed as it gets...and he calls himself an FS-Guy ;(

His comment was more on the wisdom (or otherwise) of running an out-of-tree filesystem. I think it's hard to disagree with him. He went on to say you would never be able to merge the ZFS tree with Linux. Again, he's the one who would know what code gets into Linux. His only actual comment against ZFS was that benchmarks didn't look great - which is unsurprising given all the extra work ZFS does for data integrity compared with other filesystems in production use.

https://www.realworldtech.com/forum/?threadid=189711&curpost...

From the link:

>[ZFS] was always more of a buzzword than anything else, I feel,

This is deeply ignorant. I feel that Linux has been handicapped by the fact that many developers have never done any serious enterprise administration and thus don't have a clear understanding of the needs of a set of their users.

Not everything needs to be in the Linux kernel (and honestly I don't care if it is). Looking at the past "linux-sound-system-tragedy", I would say that building something outside Linux is often much better (you don't get thousands of different people who think it's better the other way around and that you are full of sh* anyway).

>His only actual comment against ZFS was that benchmarks didn't look great

What benchmark? Mine are looking pretty good, with a preheated ARC and with L2 especially...actually much better than any HW RAID. Compared to Linus, there are institutes with a bit more than a single 3GB git repository, and the crazy stuff...they need verified backups.

https://computing.llnl.gov/projects/zfs-lustre

If he ever has to improve a Lustre filesystem of 55 petabytes with ZFS, he can come back; otherwise Linus...shut up* and be happy with your ext4 (nothing against that one).

* A homage to the old linus-style of having a discussion.

Not really. SmartOS was nice for a while but tbh you may as well go with openstack or proxmox these days.

I used it for hosting a lot of java over the years but these days everyone wants a k8s endpoint and really the kind of hypervisor you are running doesn’t really make a difference.

Shame, it was nice tech.

As someone who has run both side by side on literally identical hardware... no, Linux has not caught up.

LXC containers vs Solaris Zones... zones clearly win.

SMF vs systemd (I know that you didn't include this, but it matters)... SMF is clearly superior as well.