| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by dragontamer 2888 days ago

> Seriously, yes. If there was ever a time to rethink OS design, surely this is it.

Why?

Traditional servers are persistent: they never turn off. 500+ days of uptime is typical. And today, with VMs which at worst... hibernate... it seems like "never turning off" might be the norm.

4 comments

stouset 2888 days ago

On the contrary, as a security professional I’d be thrilled if servers had a lifespan of hours instead of weeks or months. Reimaging VMs/containers/machines from scratch frequently gives so many advantages.

When OS, system, or library updates happen, you can easily launch replacement servers on the updated stack, put them in the rotation, and decommission the old ones. This is so much simpler than trying to run OS upgrades in-place across an entire fleet. The longer a machine has been running between reboots, the lower my belief in its odds of upgrading and restarting cleanly.

Further, this regularly tests your load balancing setup and pretty much fundamentally gives you capacity to scale up and down as load permits. Problems will be discovered early on, instead of during crunch time when you have to scale or when a few of your machines go offline during peak hours.

Security-wise, you don’t just get the benefit of fast, regular updates. But you also get assurances that users haven’t left stale data like unencrypted database exports, PII dumps, etc. lying around. Go on a long-lived machine some day and check out users’ home directories. That shit is a gold mine if someone who wants to do harm gets on your systems.

Not to mention regular reimaging makes it harder for an attacker to establish a permanent foothold in your infra.

None of this has anything to do with fast persistent storage, but I sincerely hope the era of 500-day uptimes is waning.

link

mycall 2888 days ago

You forgot, frequent rebuilds kill off any intrusion as the world reflashes -- unless they get into IoT or microcontroller packages.

link

stouset 2887 days ago

I did mention that it makes it harder for an attacker to keep a foothold in your infrastructure, but I think I wasn't as clear as I wanted to be.

But yeah, it's bad that an attacker has been able to get to a critical system, but it's a phenomenal defense if any of their beacons or remote access tools last at most a few hours or days before being wiped. This makes an attacker's life much harder.

link

dragontamer 2888 days ago

> None of this has anything to do with fast persistent storage, but I sincerely hope the era of 500-day uptimes is waning.

On the contrary. Persistent memory means that infinite uptime is the future. Which, as you note, is difficult. Resetting the OS every now and then to a known state is a good practice, although disruptive to a lot of workflows.

If anything, I consider your post to be an argument AGAINST persistent memory.

link

stouset 2887 days ago

Persistent memory might enable those sorts of uptimes, but it doesn't inherently mandate it.

link

magicalhippo 2888 days ago

But traditional operating systems still assume RAM contents is volatile (because currently it is), most filesystems assume disks are glacially slow etc.

A traditional spinning rust HDD has an effective latency of ~10 ms. The NVME version of the 3DXP has an effective latency of ~50 us, or two orders of magnitude better. Not sure how low the DIMM version will go, but maybe another order of magnitude?

If so, we're talking three orders of magnitude difference. That would radically affect the assumptions going into storage algorithms. Suddenly you can no longer spend millions of instructions trying to avoid I/O. Batching of I/O is also not needed to the same degree. Complex syncing of memory and disk is not needed. Etc etc.

link

dragontamer 2887 days ago

> But traditional operating systems still assume RAM contents is volatile (because currently it is)

RAM is only volatile on startup. Certainly not when a VM hibernates and comes back.

link

magicalhippo 2887 days ago

> RAM is only volatile on startup.

No it isn't. Anything that needs to survive a power cycle needs to go to non-volatile storage. And this is assumed to be very, very slow.

link

glangdale 2888 days ago

I don't think "computers stay up a long time these days" is an argument against doing OS research on order-of-magnitude-faster, byte-addressable persistent storage.

We seem to be doing pretty well with a bunch of abstractions from the 1970s, as well as with the idea of just building giant trapdoors into our hardware whenever these abstractions fail (e.g. most databases, DPDK in the network space, etc). It's not a crisis. It just seems like a pretty good time to do some basic OS research (aside from all the usual headwinds for that, e.g. massive complexity of underlying hardware, difficulty finding meaningful workloads for a "toy" OS, etc).

link

scurvy 2888 days ago

Your ideas around uptime are still in the 70's. Systemd updates require reboots. Then there's Spectre and Meltdown BIOS updates, gotta reboot for those. Oh and the SSD and NIC firmware as well.

To think we formerly only had one devil in glibc. Now everything is constantly being updated and it's fine. We've moved on from the uptime as phallic measuring stick mantra. Patch, reboot, and stay secure.

link