| HN Mirror



	by userbinator 1737 days ago
	That was really only the case in the 286-486 era. On the 8086 it was the fastest option, and since the Pentium II, which introduced cacheline-sized moves, it has been nearly as fast as the huge unrolled SIMD implementations, which are only marginally faster in microbenchmarks. Linus Torvalds has some good comments on that here: https://www.realworldtech.com/forum/?threadid=196054&curpost...

1 comment

josefx 1736 days ago

Linus still seems to consider rep movs too slow for small copies:

https://www.realworldtech.com/forum/?threadid=196054&curpost...

It seems to me that rep movs is bad enough for small copies that you would want to avoid it, but writing a fast generic memcpy takes so much bloat to handle edge cases that rep movs remains competitive in the generic case.
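To make that trade-off concrete, here is a rough sketch in C, not anyone's actual memcpy. It assumes GCC or Clang on x86-64 for the inline asm and falls back to the library memcpy elsewhere. The names copy_rep_movsb and copy_hybrid and the 64-byte cutoff are hypothetical, picked only for illustration; a real cutoff would have to be measured per microarchitecture.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy n bytes with a single `rep movsb`.
   Assumes GCC/Clang extended asm on x86-64 and that the direction
   flag is clear, as the System V ABI requires at function boundaries. */
static void *copy_rep_movsb(void *dst, const void *src, size_t n)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    void *d = dst;
    /* RDI = destination, RSI = source, RCX = byte count */
    __asm__ volatile("rep movsb"
                     : "+D"(d), "+S"(src), "+c"(n)
                     :
                     : "memory");
    return dst;
#else
    /* Non-x86-64 fallback so the sketch still compiles and runs. */
    return memcpy(dst, src, n);
#endif
}

/* Assumed cutoff, not a measured value. */
#define SMALL_COPY_THRESHOLD 64

/* Hypothetical hybrid: a plain byte loop for small sizes, where the
   startup cost of rep movs is said to hurt, and rep movsb otherwise. */
static void *copy_hybrid(void *dst, const void *src, size_t n)
{
    if (n < SMALL_COPY_THRESHOLD) {
        unsigned char *d = dst;
        const unsigned char *s = src;
        for (size_t i = 0; i < n; i++)
            d[i] = s[i];
        return dst;
    }
    return copy_rep_movsb(dst, src, n);
}
```

Even this toy version needs a size branch and a tuning constant. A generic memcpy that also handles alignment and overlapping head/tail moves grows from there, which is the bloat being weighed against a single rep movs.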
