by koverstreet 666 days ago
I'm still waiting for rep movsb and rep stosb to be fast enough to delete my simple C loop versions, for short memcpys.
1 comment

It is likely that on recent CPUs they are always faster than C loop versions.

On my Zen 3 CPU, for lengths of 2 kB or smaller it is possible to copy faster than with "rep movsb", but only by using SIMD instructions (or, equivalently, the builtin "memcpy" provided by most C compilers), not with a C loop. The exception is when the compiler recognizes the C loop and replaces it with the builtin memcpy, which some compilers do at high optimization levels.
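
For anyone who wants to try this on their own machine, here is a rough sketch of the three variants being compared. The function names (copy_loop, copy_rep_movsb, copy_builtin, copies_agree) are made up for illustration, and the inline asm assumes GCC or Clang on x86 with the direction flag clear, as the usual ABIs require. On other targets the sketch just falls back to memcpy. It only checks that the variants agree; it does not measure anything.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Plain byte-at-a-time loop: the "simple C loop" version. */
static void *copy_loop(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    return dst;
}

/* "rep movsb" through GCC/Clang inline asm (assumption: x86 target,
 * direction flag clear). Falls back to memcpy on anything else. */
static void *copy_rep_movsb(void *dst, const void *src, size_t n)
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    void *d = dst;
    /* RDI = destination, RSI = source, RCX = byte count; all are updated. */
    __asm__ volatile("rep movsb"
                     : "+D"(d), "+S"(src), "+c"(n)
                     :
                     : "memory");
#else
    memcpy(dst, src, n);
#endif
    return dst;
}

/* The library/builtin memcpy, which typically uses SIMD for short copies. */
static void *copy_builtin(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

/* Returns 1 if all three routines reproduce the source for length n.
 * Limited to n <= 2048, the range discussed above. */
static int copies_agree(size_t n)
{
    unsigned char src[2048], a[2048], b[2048], c[2048];
    assert(n <= sizeof src);
    for (size_t i = 0; i < n; i++)
        src[i] = (unsigned char)(i * 31u + 7u);
    copy_loop(a, src, n);
    copy_rep_movsb(b, src, n);
    copy_builtin(c, src, n);
    return memcmp(a, src, n) == 0 &&
           memcmp(b, src, n) == 0 &&
           memcmp(c, src, n) == 0;
}
```

If you time these, keep in mind that at higher optimization levels the compiler may turn copy_loop into a memcpy call, so you would be measuring memcpy twice. With GCC, -fno-tree-loop-distribute-patterns should prevent that. Results will depend on the CPU, the compiler, and the flags.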