| HN Mirror



	by jayd16 2032 days ago
	It's a little confusing because they're conflating two ideas: that you almost certainly read at least an entire word (and not a single byte) at a time, and that you could fetch multiple words concurrently.


duskwuff 2032 days ago

Any cached memory access is going to read in the entire cache line -- 64 bytes on x86, apparently 128 on M1. This is true across most architectures which use caches; it isn't specific to M1 or ARM.
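
To make the arithmetic concrete, here is a minimal sketch of which bytes come along with a single access. The function name `cache_line_span` and the `line_size` parameter are made up for illustration, and the sketch assumes lines are power-of-two sized and aligned to their own size:

```rust
/// Returns the half-open byte range [start, end) of the cache line that
/// contains `addr`.
///
/// `line_size` is an assumed parameter (e.g. 64 or 128) and must be a
/// power of two so the base can be found by masking off the low bits.
pub fn cache_line_span(addr: usize, line_size: usize) -> (usize, usize) {
    assert!(line_size.is_power_of_two(), "line size must be a power of two");
    // Clearing the low bits rounds the address down to the line boundary.
    let start = addr & !(line_size - 1);
    (start, start + line_size)
}
```

Under those assumptions, a one-byte load at address 130 would pull in bytes 128..192 with 64-byte lines, or bytes 128..256 with 128-byte lines.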


kzrdude 2032 days ago

(As I learned from recent Rust concurrency changes) on newer Intel, it usually fetches two cache lines, so effectively 128 bytes, while AMD usually fetches 64 bytes. Those are the sizes they use for "cache line padded" values (i.e., making sure to separate two atomics by the fetch size so that threads don't invalidate each other's cache lines back and forth too much).
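
A rough sketch of what such padding can look like in Rust. This is a hand-rolled illustration, not the code from the changes mentioned above: the `Padded` wrapper, the `Counters` struct, and the `bump_in_parallel` helper are hypothetical names, and the 128-byte figure is simply the larger of the two sizes quoted here.

```rust
use std::mem::size_of;
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// Hypothetical wrapper: forces 128-byte alignment, which also rounds the
/// size up to 128 bytes, so each wrapped value occupies its own block.
#[repr(align(128))]
pub struct Padded<T>(pub T);

// Compile-time check that the wrapper really takes a full 128 bytes.
const _: () = assert!(size_of::<Padded<AtomicU64>>() == 128);

/// Two counters that are meant to be written by different threads.
pub struct Counters {
    pub a: Padded<AtomicU64>,
    pub b: Padded<AtomicU64>,
}

/// Increments each counter `iterations` times from its own thread.
/// Because the counters sit in separate 128-byte blocks, the two writers
/// should not keep invalidating the same cache line.
pub fn bump_in_parallel(counters: &Counters, iterations: u64) {
    thread::scope(|s| {
        s.spawn(|| {
            for _ in 0..iterations {
                counters.a.0.fetch_add(1, Ordering::Relaxed);
            }
        });
        s.spawn(|| {
            for _ in 0..iterations {
                counters.b.0.fetch_add(1, Ordering::Relaxed);
            }
        });
    });
}
```

Whether the padding pays off depends on the hardware and the access pattern, so it is worth benchmarking with and without the wrapper before relying on it.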


alblue 2032 days ago

To be clear here, it fetches two cache lines but it doesn’t put the second in exclusive state until it’s written to; the unit of granularity is still 64 bytes. In a scanning read mode you will see the benefit, but you won’t see the contention on writes. (The contention will come from subsequent reads on that cache line, though.)
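
The distinction between the two granularities can be sketched with a bit of address arithmetic. The names `LINE`, `PAIR`, `same_block`, and `classify` are made up for this example, and it assumes the second line fetched is the one that completes an aligned 128-byte pair:

```rust
/// Assumed coherence granularity: one 64-byte cache line.
pub const LINE: usize = 64;
/// Assumed fetch granularity: an aligned pair of lines, 128 bytes.
pub const PAIR: usize = 128;

/// True if both byte addresses fall in the same aligned block of `block` bytes.
pub fn same_block(a: usize, b: usize, block: usize) -> bool {
    assert!(block.is_power_of_two(), "block size must be a power of two");
    a / block == b / block
}

/// Describes how two addresses relate under the assumptions above.
pub fn classify(a: usize, b: usize) -> &'static str {
    if same_block(a, b, LINE) {
        // Writes from different cores contend for this one line.
        "same line"
    } else if same_block(a, b, PAIR) {
        // Likely fetched together, but each line is owned separately.
        "adjacent lines in one pair"
    } else {
        "different pairs"
    }
}
```

Under these assumptions, addresses 0 and 8 share a line, 0 and 64 are adjacent lines in one pair, and 64 and 128 land in different pairs.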


jayd16 2032 days ago

Yes, almost certainly more than a word will be read, but it varies by architecture. I would think that, almost by definition, no less than a word can be read, so I went with that in my explanation.
