Hacker News
by alex-mohr 2453 days ago
Generally seems like a great offering!

I see immutable, but also upgradable? Is that via in-place upgrades or do upgrades require a reboot?

Example: severe bug or vulnerability in kubelet or containerd/docker. Can I use the API to roll out a fix to existing nodes such that running workloads have no disruption?

2 comments

We are taking two approaches to this. The first is that you could roll out a replacement node and shut down the old one. In bare metal scenarios this is much harder, so we implemented in-place upgrades, but they work very similarly to creating a new node. Since Talos is immutable and runs from RAM, an in-place upgrade consists of shutting down all services, then wiping the disk and performing a fresh install. We then reboot the node, and it's as if you wiped the machine clean and installed the new version of Talos from the get-go. This is all via the API, by the way.
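To make the ordering of the in-place upgrade described above concrete, here is a minimal sketch in Python. It is not Talos code: the `Node` class, its fields, and `in_place_upgrade` are hypothetical names made up for illustration, and the version labels are placeholders. It only models the four steps in the order the comment gives them.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a machine; not a Talos type."""
    version: str
    services_running: bool = True
    history: List[str] = field(default_factory=list)


def in_place_upgrade(node: Node, new_version: str) -> List[str]:
    """Apply the steps from the comment above, in order, and return them."""
    # 1. Shut down all services so nothing is still using the disk.
    node.services_running = False
    node.history.append("stop services")

    # 2. Wipe the disk; the old installation is gone after this step.
    node.version = ""
    node.history.append("wipe disk")

    # 3. Perform a fresh install of the new version.
    node.version = new_version
    node.history.append(f"install {new_version}")

    # 4. Reboot; the node comes back as if it had been freshly installed.
    node.services_running = True
    node.history.append("reboot")
    return node.history


node = Node(version="old")
print(in_place_upgrade(node, "new"))
```

Because the disk is wiped before the install, nothing from the old version carries over, which is why the result looks the same as provisioning a new node.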
Wait, you store the local roots on disk? Why not NFS or something similar, especially if you run from RAM anyway?

Also sounds like a missed opportunity for kexec and a pivot to new rootfs on a new ramdisk?

Edit: based on https://www.talos-systems.com/docs/guides/bare_metal/ I gather I misunderstood what was said here; it's a new config in PXE, then shutdown and reboot? Which maybe could be shutdown and kexec.

The rootfs is stored in the bootloader partition and in the initramfs. As for NFS, I can see us adding support for that, but the out-of-the-box experience for Talos in any of the clouds will be painful if we exclusively require NFS.

Since we adhere to the KSPP (https://kernsec.org/wiki/index.php/Kernel_Self_Protection_Pr...) guidelines, kexec is not an option unfortunately. We thought about this early on, but opted to follow KSPP over using kexec.
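For readers who want to see what "kexec is not an option" looks like on a given kernel, here is a rough sketch that checks a kernel config for the kexec options. It assumes, as I understand the KSPP recommended settings, that `CONFIG_KEXEC` is expected to be unset; the helper name `kexec_enabled` and the sample config strings are made up for illustration and do not come from Talos.

```python
def kexec_enabled(kernel_config: str) -> bool:
    """Return True if the config text builds in kexec support.

    A disabled option appears as "# CONFIG_KEXEC is not set" or is absent,
    so only an explicit "=y" line counts as enabled.
    """
    enabled_lines = {"CONFIG_KEXEC=y", "CONFIG_KEXEC_FILE=y"}
    for raw in kernel_config.splitlines():
        if raw.strip() in enabled_lines:
            return True
    return False


# Hypothetical config fragments, not taken from any real build.
hardened = "# CONFIG_KEXEC is not set\nCONFIG_MODULES=y\n"
stock = "CONFIG_KEXEC=y\nCONFIG_MODULES=y\n"

print(kexec_enabled(hardened))
print(kexec_enabled(stock))
```

On a running system the same text can usually be read from the kernel's build config, if the distribution ships it.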

The whole point of Kubernetes is that you don't think this way. Replacing a node is not an impactful event if you're using K8S correctly.
I agree with this to an extent. There are certainly places where replacing a node can be expensive. For example, bare metal, or if the machine contains a large amount of data and moving that data to a new node is time-consuming.
Your storage should be separated from the worker nodes. Unless you have some hyper-converged setup, in which case you have made the deliberate choice to make your nodes special. (Sorry for using the term hyper-converged.)
Until you have to deal with RWO PVCs, and evicting a node requires an expensive and slow disk detach/attach operation.
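As a rough illustration of the "replace the node" workflow debated in this subthread, the sketch below builds the usual `kubectl` sequence for retiring a node. It only constructs the command lines and does not run them; the function name and the node name `worker-1` are hypothetical. Note that draining does not avoid the cost raised above: pods with RWO volumes still have to wait for the volume to detach and reattach on their new node.

```python
from typing import List


def replace_node_commands(old_node: str) -> List[List[str]]:
    """Build (but do not execute) the commands to retire a node."""
    return [
        # Stop new pods from being scheduled onto the node.
        ["kubectl", "cordon", old_node],
        # Evict running pods; DaemonSet pods are skipped, not evicted.
        ["kubectl", "drain", old_node, "--ignore-daemonsets"],
        # Remove the node object once it is empty.
        ["kubectl", "delete", "node", old_node],
    ]


for cmd in replace_node_commands("worker-1"):
    print(" ".join(cmd))
```

To actually run the commands, each list could be passed to `subprocess.run(cmd, check=True)`, assuming `kubectl` is installed and pointed at the right cluster.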