A NetBSD/amd64 guest can now boot in 18ms (old.reddit.com)
112 points by bleusink 881 days ago
6 comments

I'm curious as to why this isn't 0ms for a VM. The entire state of the system can be known ahead of time. Why does the kernel need to do any kind of dynamic initialization? Why aren't all data structures and variables statically assigned to proper values for the given VM? Then at VM startup it could simply enter the main dispatch loop.
Mostly because that's not how the kernel is designed. While what you're saying is true, it would require a fundamentally different kernel design. The kernel as it is today is designed to work on a variety of devices, not just VMs, so it behaves in a way in which it doesn't have specific knowledge of the device prior to initialization. To do what you're suggesting, the kernel would need to have a special mode where it supports injection of this information from the VM host in advance of initialization, and there's probably a bunch of possible security issues involved as well.
You don't need kernel support; just freeze its state right after the boot and clone it afterwards.

Boot time then comes down to setting up shadow page tables and other hypervisor structures at new offsets, plus a few tricks to avoid reproducible RNG output.

> […] freeze it's state right after the boot and clone it afterwards.

You can't do that easily as devices can be attached, detached or reattached dynamically whilst the VM is running and when it is shut down.

In the absence of hardware / VM support for device trees, with the existing design, the kernel has to probe each device upon every boot.

Well, it's not an ordinary boot of course; it's a clone, and the state on both sides of the fence should be faithfully cloned, including virtual devices. The guest kernel will have no way of knowing whether it was resumed once or twice.

Applicability is limited, but it works well to spin up serverless compute on-demand.

NetBSD can just trim the kernel down to Virtio support and not much more.
I wonder the same thing for my regular OS too. Sure the first time it boots on this hardware, it needs to see what's what. But the subsequent hundred/thousand boots, at least for my usage, will be identical.
Before you can reload some previous known state, you need to initialize the hardware, load the firmwares and probe the devices (USB, SATA, PCI), discover the storage volumes and the partition tables. All these steps take time.
That's kind of the point of Windows' Fast Boot option. The base OS pretty much just hibernates at a normal shutdown. No point in re-initializing and restarting all the base services.
Is it a Windows option or a BIOS/UEFI option? Will it work on Linux or the *BSDs?
There is often a UEFI option for a "fast boot" as well, where it skips some steps in its initialization and doesn't wait for any user input or log to the local console if it's coming back from a successful boot/power-off cycle. But that's separate from the Windows feature, which is related to hibernation. You could have Windows Fast Boot enabled while having your UEFI set to a full boot cycle, or vice versa.

The UEFI option should work fine with pretty much any OS the hardware should be able to boot.

That particular feature is a Windows feature, though there's no reason you theoretically couldn't do something similar on another OS.
Sounds like resuming from a snapshot.
Exactly, you just need to separate things that need to happen before snapshot from things that must be run every time. Then you just snapshot as part of the build.

This is how GraalVM and OpenJ9 achieve instant startup of Java programs.

Or Smalltalk and Common Lisp images.
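
To make the "snapshot as part of the build" idea concrete, here is a minimal sketch in Python. It is only an illustration of the split between build-time and per-start work, not how GraalVM, OpenJ9, or any Lisp image actually does it; `expensive_init`, `per_start_init`, and the state layout are hypothetical names made up for the example.

```python
import pickle
import time


def expensive_init():
    """Stand-in (hypothetical) for work that is identical on every start."""
    return {
        "lookup": {n: n * n for n in range(1000)},
        "config": {"mode": "default"},
    }


def per_start_init(state):
    """Work that must run on every start: anything time- or instance-specific."""
    state = dict(state)
    state["started_at"] = time.time()
    return state


def build_snapshot():
    # Done once, at build time: run the expensive part and freeze the result.
    return pickle.dumps(expensive_init())


def start_from_snapshot(snapshot):
    # Done on every start: thaw the frozen state, then run only the per-start part.
    return per_start_init(pickle.loads(snapshot))


snapshot = build_snapshot()
running = start_from_snapshot(snapshot)
```

The hard part in practice is deciding which side of the line each piece of initialization belongs on; anything that captures time, identity, or randomness has to stay on the per-start side.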
Also sounds like unikernels.
Wouldn't that mess with RNGs?
Maybe one could use hardware RNGs? Not sure if OSes can be made to not cache/buffer entropy, or be forced to reset RNGs.

EDIT: at least linux seems to be capable of doing that: https://www.systutorials.com/docs/linux/man/9-crypto_rng_res...
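
A small Python sketch of why this matters and what "resetting" buys you. This is a toy model, assuming the clone copies the generator state bit for bit; `random.Random` is not a cryptographic generator, and `os.urandom` merely stands in for whatever fresh entropy source (hardware RNG, host-provided seed) a real guest would use on resume. The function names are made up for the example.

```python
import copy
import os
import random


def clone_vm(rng):
    """Model a snapshot clone: the generator state is copied exactly."""
    return copy.deepcopy(rng)


def reseed_on_resume(rng):
    """Replace the inherited state with fresh entropy after resume."""
    rng.seed(os.urandom(32))


parent = random.Random(1234)
clone_a = clone_vm(parent)
clone_b = clone_vm(parent)

# Without reseeding, both clones hand out the same "random" values.
same_a = [clone_a.getrandbits(64) for _ in range(4)]
same_b = [clone_b.getrandbits(64) for _ in range(4)]

# After reseeding from fresh entropy, the streams diverge.
reseed_on_resume(clone_a)
reseed_on_resume(clone_b)
diff_a = [clone_a.getrandbits(64) for _ in range(4)]
diff_b = [clone_b.getrandbits(64) for _ in range(4)]
```

Here `same_a == same_b` holds, while `diff_a` and `diff_b` differ, which is the property a cloned guest needs before it generates keys or nonces.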

How does this compare to other BSDs?

Wasn't there recently a lot of work to reduce the boot time on FreeBSD (and it's at 20ms)?

https://www.usenix.org/publications/loginonline/freebsd-fire....

"I think the fastest I got the FreeBSD kernel booting in Firecracker was 21 ms. NetBSD is now at 18 ms... I need to go back and address some more of the issues I noticed but didn't get around to fixing.

Anyone know what the current record for Linux is? Last I heard was ~50 ms."

- @cperciva at https://twitter.com/cperciva/status/1747270461095043532

> “I need to go back and address some more of the issues I noticed but didn't get around to fixing.”

I wonder if this is related to what Netflix found as a regression.

Starting at slide #18 below

https://people.freebsd.org/~gallatin/talks/OpenFest2023.pdf

The bug Netflix tripped over was something I introduced while shaving off milliseconds, yes.

The "other issues I didn't get around to fixing" are things like precomputing lookup tables (we can wait and do them on demand, or not at all if they turn out to never get used) and an O(n^2) issue registering names of sysctls.
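
For readers who want to see the shape of those two fixes, here is a generic Python sketch. It is not the FreeBSD code: the CRC-32 table is just a familiar example of a lookup table that can be built on first use, and the duplicate-name check is an assumed stand-in for how a name registry can go quadratic. All function names are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def crc_table():
    """Built on first use instead of at startup; never built if never called."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0xEDB88320 if crc & 1 else crc >> 1
        table.append(crc)
    return tuple(table)


def register_quadratic(names):
    """Each insert scans everything registered so far: O(n^2) overall."""
    registered = []
    for name in names:
        if name in registered:  # linear scan of a list
            raise ValueError(f"duplicate name: {name}")
        registered.append(name)
    return registered


def register_linear(names):
    """Same result, but membership is checked against a hash set: O(n) overall."""
    seen = set()
    registered = []
    for name in names:
        if name in seen:  # constant-time lookup on average
            raise ValueError(f"duplicate name: {name}")
        seen.add(name)
        registered.append(name)
    return registered


names = [f"kern.example.{i}" for i in range(2000)]
```

Both registration functions return the same list; only the cost of the membership check changes, which is what matters when thousands of names are registered during boot.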

This work is based on the FreeBSD work.
That's nice. Personally, it takes me about 400 ms to start a Linux kernel with qemu (more time is spent by qemu initializing its network interfaces than starting the kernel and the init!)

I'll see which tricks can be reused on Linux, as I'd love to cut that by an order of magnitude!

What's the Firecracker command for this? (and also for Linux)
I am assuming you're asking about how to boot up the image. You can poke around here:

https://dev.l1x.be/posts/2020/12/13/diving-into-firecracker-...

If you are asking for the NetBSD image I am not sure.

I like this resource for starting a VM from a container image: https://github.com/alexellis/firecracker-init-lab
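
For the Linux half of the question, a minimal sketch, assuming Firecracker's JSON config-file mode (`--no-api --config-file`). The Python below only builds the config; the kernel and rootfs paths are placeholders, and the boot arguments are the ones commonly seen in Firecracker's getting-started material rather than anything from this thread. The 1 vCPU / 128 MB sizing is the configuration mentioned elsewhere in the thread. Check the field names against the Firecracker docs for your version before relying on them.

```python
import json


def firecracker_config(kernel_path, rootfs_path, vcpus=1, mem_mib=128):
    """Build a config dict in the shape Firecracker's --config-file expects."""
    return {
        "boot-source": {
            "kernel_image_path": kernel_path,  # placeholder path
            "boot_args": "console=ttyS0 reboot=k panic=1 pci=off",
        },
        "drives": [
            {
                "drive_id": "rootfs",
                "path_on_host": rootfs_path,  # placeholder path
                "is_root_device": True,
                "is_read_only": False,
            }
        ],
        "machine-config": {"vcpu_count": vcpus, "mem_size_mib": mem_mib},
    }


config = firecracker_config("vmlinux.bin", "rootfs.ext4")
config_json = json.dumps(config, indent=2)
# Write config_json to vm_config.json, then launch with:
#   firecracker --no-api --config-file vm_config.json
```

This says nothing about the NetBSD image or the exact invocation used for the 18 ms number.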
[in mice]
With 1 CPU and 128 MB of RAM. Sure, it's a small system, but it's enough to be useful for some purposes -- more useful than medical discoveries in mice, at least.
And in a VM, not bare metal.

"Boot" isn't even a sensible concept in a VM, IMHO, you could just toss an image into RAM. There's no hardware to initialize, the caches are probably already warm, etc.

You do need to shake hands with the virtio devices; that's a bit easier than most hardware, but it's not trivial.

The "take a snapshot after you finish booting and resume that" approach can work (Lambda does it) but it only works if you have the same type and number of CPUs, the same amount of RAM, the same filesystem on disk, and even the same MAC addresses on your network interfaces. So it's not like FreeBSD can ship useful "pre-booted" images.
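
A rough Python sketch of that constraint, assuming a snapshot is tagged with the "shape" of the VM it was taken on and may only be resumed on an identical shape. The `VmShape` fields mirror the list above; the class, the helper names, and the sample values are all hypothetical, and a real hypervisor would track more than this.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class VmShape:
    cpu_model: str
    vcpu_count: int
    mem_mib: int
    rootfs_sha256: str
    mac_addresses: tuple


def mismatches(snapshot_shape, target_shape):
    """Names of the fields on which the two shapes differ."""
    return [
        f.name
        for f in fields(VmShape)
        if getattr(snapshot_shape, f.name) != getattr(target_shape, f.name)
    ]


def can_resume(snapshot_shape, target_shape):
    """A snapshot is only usable when nothing about the shape differs."""
    return not mismatches(snapshot_shape, target_shape)


taken_on = VmShape("example-cpu", 1, 128, "abc123", ("02:00:00:00:00:01",))
same = VmShape("example-cpu", 1, 128, "abc123", ("02:00:00:00:00:01",))
bigger = VmShape("example-cpu", 2, 256, "abc123", ("02:00:00:00:00:01",))
```

Here `can_resume(taken_on, same)` is true, while `mismatches(taken_on, bigger)` reports `vcpu_count` and `mem_mib`, which is why a single pre-booted image cannot cover arbitrary user configurations.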

Very interesting