30 years ago we had blitter chips dedicated to moving memory around without using the CPU. Pretty sure most memory controllers would have something similar these days.
memcpy is done by the CPU core, at least for x86. The IMCs don't process data.
The CPU core has more bandwidth than the IMC anyway, so adding this complexity to the IMC would give no speed-up. Besides performing the copy, it would also need a way to maintain cache coherence and to communicate with the issuing CPU, neither of which is a problem if you just do it in the core. It might not even save power.
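To make "done by the CPU core" concrete, here is a rough C sketch of a copy written as explicit loads and stores, plus a crude throughput timer. `core_copy` and `copy_gbps` are made-up names for illustration, not anything from a real libc. Actual memcpy implementations use wider vector registers or `rep movsb`, but the data still passes through the core either way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: copy n bytes as explicit loads and stores.
   The fixed-size memcpy calls are the portable way to express an
   unaligned 8-byte access; optimizing compilers typically turn each
   one into a single move instruction. */
void core_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n >= sizeof(uint64_t)) {
        uint64_t word;
        memcpy(&word, s, sizeof word); /* 8-byte load into a register */
        memcpy(d, &word, sizeof word); /* 8-byte store from that register */
        s += sizeof word;
        d += sizeof word;
        n -= sizeof word;
    }
    while (n > 0) {                    /* byte-wise tail */
        *d++ = *s++;
        n--;
    }
}

/* Hypothetical helper: time `reps` copies of `n` bytes and return GB per
   second of CPU time. Returns 0.0 if the timer did not advance. */
double copy_gbps(size_t n, int reps)
{
    assert(n > 0 && reps > 0);
    unsigned char *src = malloc(n);
    unsigned char *dst = malloc(n);
    assert(src != NULL && dst != NULL);
    memset(src, 0xA5, n);
    memset(dst, 0, n);                 /* touch pages so faults are not timed */

    volatile unsigned char sink = 0;
    clock_t start = clock();           /* clock() counts CPU time, not wall time */
    for (int i = 0; i < reps; i++) {
        core_copy(dst, src, n);
        sink ^= dst[n - 1];            /* read the result so the copy is kept */
    }
    clock_t end = clock();
    (void)sink;

    free(src);
    free(dst);

    double elapsed = (double)(end - start) / CLOCKS_PER_SEC;
    if (elapsed <= 0.0)
        return 0.0;
    return ((double)n * (double)reps) / elapsed / 1e9;
}
```

Calling something like `copy_gbps((size_t)256 << 20, 8)` gives a ballpark figure for one core. The number depends heavily on compiler flags, buffer size relative to the caches, and the machine, so treat it as an illustration rather than a benchmark.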