by JoshTriplett
662 days ago
This is a side note to the main point being made, but on modern CPUs, "rep movsb" is just as fast as the fastest vectorized version, because the CPU knows to accelerate it. The name of the kernel function "copy_user_enhanced_fast_string" hints at this: the CPU features are ERMS ("Enhanced REP MOVSB/STOSB", which makes "rep movsb" faster for anything above a certain length threshold) and FSRM ("Fast Short REP MOV", which makes "rep movsb" faster for shorter moves too).
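To make that concrete, here is a minimal sketch (not the kernel's or glibc's code) of checking for the two features and issuing a bare "rep movsb". It assumes GCC or Clang on x86-64; the function names query_rep_movs_features and rep_movsb_copy are made up for illustration. The feature bits are CPUID leaf 7, subleaf 0: EBX bit 9 for ERMS and EDX bit 4 for FSRM. On other targets the sketch just falls back to memcpy.

```c
#include <stddef.h>
#include <string.h>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <cpuid.h>
#define HAVE_X86_64_GNU 1
#endif

/* Hypothetical helper: report ERMS and FSRM support as 0 or 1.
 * Both flags live in CPUID leaf 7, subleaf 0. */
static void query_rep_movs_features(int *erms, int *fsrm)
{
    *erms = 0;
    *fsrm = 0;
#ifdef HAVE_X86_64_GNU
    unsigned int eax, ebx, ecx, edx;
    if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) {
        *erms = (ebx >> 9) & 1;  /* ERMS: EBX bit 9 */
        *fsrm = (edx >> 4) & 1;  /* FSRM: EDX bit 4 */
    }
#endif
}

/* Hypothetical helper: copy n bytes with a single "rep movsb".
 * Like memcpy, the regions must not overlap. The ABI guarantees the
 * direction flag is clear on function entry, so the copy runs forward. */
static void *rep_movsb_copy(void *dst, const void *src, size_t n)
{
    void *ret = dst;
#ifdef HAVE_X86_64_GNU
    /* RDI = destination, RSI = source, RCX = byte count. */
    __asm__ volatile("rep movsb"
                     : "+D"(dst), "+S"(src), "+c"(n)
                     :
                     : "memory");
#else
    memcpy(dst, src, n);  /* fallback on non-x86-64 targets */
#endif
    return ret;
}
```

Whether this actually beats a vectorized loop depends on the CPU and the copy size, which is why real implementations pick between the two using thresholds rather than always taking one path.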
|
All thresholds are described in https://codebrowser.dev/glibc/glibc/sysdeps/x86_64/multiarch...
They are not final, either: Noah Goldstein still updates them every year.