In a lot of ways, basing this on k8s gives us a lot of flexibility and independence, and with k8s there's much less friction in providing computes with high locality to users' application code.
It's also the case that by staying with k8s we can take advantage of existing operational tooling, experience, and work, and can focus our development time on the important parts of this problem (runtime scaling, scheduling, and virtual machine management) rather than on cloud provider APIs and management.
In short, k8s gives us options that we like for the future, it's shortening the development cycle, and it's only getting in our way a below-average amount. At the same time, for the most part, we're building this with reasonable abstractions that would let us reuse our existing work if k8s becomes more trouble than it's worth.
Firecracker doesn’t support live migrations. There is a new project called Cloud Hypervisor, and it showed a lot of promise, but we struggled to make it work and reverted to QEMU.
As for k8s, it’s an ongoing internal debate whether the complexity is worth the benefit. It helps us provision nodes, but we have to fight it quite a bit too. It’s unclear whether we will keep it long term.