| HN Mirror



	by gatronicus 1736 days ago
	Except that for decades REP MOVS/STOS were avoided on x86 because they were much slower than hand-written assembly. This only changed recently.

1 comments

userbinator 1736 days ago

That was really only in the 286-486 era. On the 8086 it was the fastest, and since the Pentium II, which introduced cacheline-sized moves, it has been nearly as fast as the huge unrolled SIMD implementations that are marginally faster in microbenchmarks.

Linus Torvalds has some good comments on that here: https://www.realworldtech.com/forum/?threadid=196054&curpost...


josefx 1736 days ago

Linus still seems to consider REP MOVS too slow for small copies:

https://www.realworldtech.com/forum/?threadid=196054&curpost...

It seems to me that REP MOVS is bad enough that you want to avoid it, but trying to write a fast generic memcpy results in so much bloat to handle edge cases that REP MOVS remains competitive in the generic case.
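
To make the trade-off concrete, here is a rough sketch of the shape being discussed: a plain byte loop for small sizes and REP MOVSB for everything else. The names sketch_memcpy, copy_rep_movsb and SMALL_COPY_MAX are made up for illustration, the threshold of 16 is an arbitrary assumption rather than a measured value, and the inline asm assumes GCC or Clang on x86-64 (other targets fall back to the library memcpy). It makes no claim about which path is faster on any given CPU.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical cutoff for the "small copy" path. The value 16 is an
 * assumption for illustration; a real cutoff would have to be measured. */
#define SMALL_COPY_MAX 16

/* Copy n bytes with a single REP MOVSB where available. */
static void *copy_rep_movsb(void *dst, const void *src, size_t n)
{
    void *ret = dst;
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    /* REP MOVSB copies RCX bytes from [RSI] to [RDI], advancing all three
     * registers. The ABI guarantees the direction flag is clear on entry,
     * so the copy runs forward. */
    __asm__ volatile("rep movsb"
                     : "+D"(dst), "+S"(src), "+c"(n)
                     :
                     : "memory");
#else
    /* Assumption: on other targets just defer to the library routine. */
    memcpy(dst, src, n);
#endif
    return ret;
}

/* Small sizes take a simple byte loop; larger sizes use REP MOVSB.
 * Like memcpy, this assumes the two buffers do not overlap. */
static void *sketch_memcpy(void *dst, const void *src, size_t n)
{
    if (n <= SMALL_COPY_MAX) {
        unsigned char *d = dst;
        const unsigned char *s = src;
        while (n--)
            *d++ = *s++;
        return dst;
    }
    return copy_rep_movsb(dst, src, n);
}
```

Every extra size class or alignment case added in front of the REP MOVSB call is another branch and more code, which is the bloat mentioned above.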
