Hacker News
by dragontamer 1292 days ago
Does AMD support enhanced REP MOVS?

I know that when I compile with GCC using AVX512 flags, it seems to emit memcpy as moves through AVX registers (ZMMs and the like)...

Auto-vectorization usually sucks for most code. But very simple structure-setting / memcpy / memset-like code is ideal for AVX512. It's a pretty common use case (think of a C++ vector<SomeClass> where the default constructor sets the 128-byte structure to some defaults).
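
To make that use case concrete, here is a minimal sketch of the pattern. The layout of SomeClass, its default values, and the helper make_defaults are all made up for illustration; only the 128-byte size comes from the comment above. Whether GCC actually turns the constructor loop into ZMM stores depends on the compiler version and flags (something like -O3 -mavx512f), so check the generated assembly rather than assuming it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 128-byte structure standing in for SomeClass.
struct SomeClass {
    std::uint32_t id;
    std::uint32_t flags;
    float weights[30];  // 30 * 4 = 120 bytes; with the 8 bytes above: 128

    // Default constructor that sets every field to a fixed default.
    // This is the memset-like initialization the compiler may vectorize.
    SomeClass() : id(0), flags(1) {
        for (float &w : weights) w = 1.0f;
    }
};

static_assert(sizeof(SomeClass) == 128, "expected a 128-byte structure");

// Hypothetical helper: builds n default-constructed elements, which runs
// the default constructor once per element over contiguous storage.
std::vector<SomeClass> make_defaults(std::size_t n) {
    std::vector<SomeClass> v(n);
    assert(v.size() == n);
    return v;
}
```

Calling make_defaults(1000) writes 128,000 contiguous bytes of default values, which is the kind of bulk store that wide vector moves or a fast REP STOS/MOVS path are meant for.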

1 comment

AVX512 by itself doesn't imply Icelake+. The feature that matters here is FSRM (Fast Short REP MOV), which is distinct from AVX512. In particular, Skylake Xeon, Cannon Lake, Cascade Lake, and Cooper Lake all have AVX512 but not FSRM. My expectation is that all future architectures will support it, so I would expect memcpy and memset implementations tuned for Icelake and onwards to take advantage of it.
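
Since these are separate CPUID bits, the simplest way to answer the question for a given machine is to query them directly. Below is a rough sketch using the __get_cpuid_count helper from GCC/Clang's <cpuid.h>; the struct RepMovsFeatures and the function query_rep_movs_features are names invented for this example. It reports ERMS (the "enhanced REP MOVSB" bit from the original question), FSRM, and AVX512F side by side, and falls back to all-false on non-x86 targets.

```cpp
#include <cassert>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>  // GCC/Clang helper header, x86 targets only
#endif

// Hypothetical result type for this sketch.
struct RepMovsFeatures {
    bool erms;     // Enhanced REP MOVSB/STOSB: CPUID.(EAX=7,ECX=0):EBX bit 9
    bool fsrm;     // Fast Short REP MOV:       CPUID.(EAX=7,ECX=0):EDX bit 4
    bool avx512f;  // AVX-512 Foundation:       CPUID.(EAX=7,ECX=0):EBX bit 16
};

// Reads CPUID leaf 7, subleaf 0. The AVX512F bit only says the CPU
// implements it; OS support for ZMM state is a separate check that this
// sketch does not perform.
RepMovsFeatures query_rep_movs_features() {
    RepMovsFeatures f = {false, false, false};
#if defined(__x86_64__) || defined(__i386__)
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    // __get_cpuid_count returns 0 when the requested leaf is unsupported.
    if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) {
        f.erms = ((ebx >> 9) & 1u) != 0;
        f.fsrm = ((edx >> 4) & 1u) != 0;
        f.avx512f = ((ebx >> 16) & 1u) != 0;
    }
#endif
    return f;
}
```

On the parts listed above (Skylake Xeon, Cannon Lake, Cascade Lake, Cooper Lake) this should come back with avx512f set and fsrm clear. The bits are architectural CPUID flags, so the same check applies to any x86 vendor that reports them.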