Hacker News
by anlsh 90 days ago
Oh neat, a post I actually know something about! I worked a lot on userfaultfd performance for GCE's live migration post-copy a couple of years ago. Or more specifically, I worked on mechanisms to avoid it entirely: due to lock contention in the kernel, faults become veeeerry slow as the number of vCPUs scales, and as it happens, VMs these days can have a lot of vCPUs.

That's very interesting! I was noticing page fault storms on live migrations as well, and I wonder if that's what you were running into when you mention the lock contention?