Hacker News new | ask | show | jobs
by zozbot234 169 days ago
On quite a few recent chips (including, AIUI, Apple M series) you can only saturate memory bandwidth by resorting to the iGPU (which has access to unified memory), CPU cores on their own won't do it. It means that using the iGPU as a blitter for huge in-memory transfers and for all throughput-limited computation (including such things as parallel parsing or de/compression workloads) is now the technically advisable choice, provided that this can be arranged.

> If however you use a zero-copy format, the CPU can skip data that it doesn't care about, so you can 'exceed' the 6 GB/s limit.

Of course the "skipping" is by cachelines. A cacheline is effectively a self-contained block of data from a memory throughput perspective, once you've read any part of it the rest comes for free.