by paulfurtado 1571 days ago
Slightly pedantic: ec2 doesn't actually support nested virtualization on any instance type I know of, but does have baremetal instance types that support virtualization.
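If you want to check this on your own instance, here is a rough sketch rather than an official AWS method. On x86 Linux the CPU advertises hardware virtualization through the `vmx` (Intel VT-x) or `svm` (AMD-V) flags in `/proc/cpuinfo`, and KVM is usable only if `/dev/kvm` exists and is accessible. The function names are my own, and this assumes an x86 guest; ARM instances report features differently.

```python
import os


def has_virt_extensions(cpuinfo_text: str) -> bool:
    """Return True if any 'flags' line lists vmx (Intel VT-x) or svm (AMD-V)."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags = set(value.split())
            if "vmx" in flags or "svm" in flags:
                return True
    return False


def kvm_usable(dev_path: str = "/dev/kvm") -> bool:
    """Return True if the KVM device node exists and is readable and writable."""
    return os.path.exists(dev_path) and os.access(dev_path, os.R_OK | os.W_OK)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # not Linux, or /proc is unavailable
    print("virtualization extensions exposed:", has_virt_extensions(text))
    print("/dev/kvm usable:", kvm_usable())
```

If the first check comes back false, Firecracker and Kata have nothing to run on, whatever else you install.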

I mention this because, sadly, baremetal instance types are only ever the largest size of a given family, which is cost-prohibitive for most users. Even if cost isn't an issue, they take much longer to start (10-20+ minutes) and fail to start far too frequently. It's a real shame that every instance type other than baremetal has virtualization extensions disabled; otherwise we'd be running far more workloads in Firecracker or Kata. We operate huge Kubernetes clusters, so the cost is roughly the same whether we use fewer big instances or more small ones, but those startup times and that failure rate are terrible for autoscaling.

Please, AWS, bring nested virtualization to all nitro instance types!


You can run https://gvisor.dev/ without any virtualization requirement. We use this to host user-submitted configurations (not arbitrary code, but arbitrary input to ~mostly trusted code).

Does this not meet your requirements?

gvisor is awesome and works well for particularly untrusted applications, but it's not a performance hit we'd be willing to take across the board, and it effectively only protects you from security bugs rather than other kernel issues. We run thousands of production database workloads, hundreds of load balancers, thousands of web apps, ML jobs, batch processing, etc. in Kubernetes, most of which require as much performance as possible.

When an EBS volume for a pod becomes impaired, if it's using xfs you can basically count the whole server as dead, no matter how many xfs and block I/O timeouts you set. Once xfs hangs in an unmount call for one filesystem, it can no longer mount or unmount any others. With a proper VM, you'd pass the NVMe device through to the guest with PCIe passthrough and the host would be totally unaffected.

Also, gvisor's better mode requires kvm, but it's cool that it effectively functions with ptrace when you can't use kvm.
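To make the kvm/ptrace choice concrete, here is a minimal sketch that picks a gVisor platform based on whether `/dev/kvm` is present and builds the matching Docker runtime entry. `--platform` is a real `runsc` flag, and `kvm` and `ptrace` are the values discussed here, but the `runsc` install path and the helper names are my assumptions, and available platform names can differ between gVisor versions.

```python
import json
import os


def pick_platform(kvm_available: bool) -> str:
    """Prefer the KVM platform when the host exposes it, else fall back to ptrace."""
    return "kvm" if kvm_available else "ptrace"


def docker_runtime_config(platform: str, runsc_path: str = "/usr/local/bin/runsc") -> dict:
    """Build the 'runtimes' section of Docker's daemon.json for runsc.

    runsc_path is an assumed install location; adjust it for your host.
    """
    return {
        "runtimes": {
            "runsc": {
                "path": runsc_path,
                "runtimeArgs": [f"--platform={platform}"],
            }
        }
    }


if __name__ == "__main__":
    platform = pick_platform(os.path.exists("/dev/kvm"))
    print(json.dumps(docker_runtime_config(platform), indent=2))
```

On a non-baremetal EC2 instance this should end up on `ptrace`, since `/dev/kvm` won't be there.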