by bch 1339 days ago
> A cache miss and going to RAM is usually fast enough that we as software engineers don't care about it, and in fact our programming language of choice may not even give us a way of telling the difference between a piece of data coming from a CPU register or L1 cache vs going to RAM, but that doesn't mean the blocking isn't happening.

Yes, this is the line being discussed, and I guess (historically) I’ve just considered it “a cost” without dragging “blocking” into the equation. We know that not accessing memory is cheaper than accessing it, and we can tune (pack our structs, mind thrashing the cache), but calling that blocking is still new to me. I’ll have to consider what it means. And also, does it imply the existence of non-blocking memory (I don’t think DMA is typically in a developer’s toolkit, but…)?

2 comments

> And also, does it imply the existence of non-blocking memory

Prefetching instructions, which tell the processor to load data before you use it!

The first Google hit [0] even calls it non-blocking memory access!

In [1] you can see some of the available prefetching instructions, and in [2] some analysis on how they deal with TLB misses (another extremely expensive way memory access can be blocking short of a page fault).

Another thing not mentioned above is that accessing a page of newly allocated memory often causes a page fault, since the kernel often delays the actual allocation until each page is first used (this is what makes overcommit possible). The same goes for writing to memory that is copy-on-write after a fork!
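If you want to see the first-touch effect for yourself, here is a rough sketch. It assumes a POSIX system with `mmap`, `MAP_ANONYMOUS` and `getrusage`; the function names `fault_count` and `faults_on_first_touch` are made up for illustration.

```c
#define _DEFAULT_SOURCE /* assumption: needed for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Page faults (minor + major) this process has taken so far, or -1 on error. */
static long fault_count(void) {
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0) {
        return -1;
    }
    return ru.ru_minflt + ru.ru_majflt;
}

/* Map `pages` fresh anonymous pages, write one byte into each, and return
 * how many page faults were counted during the writes (-1 on any error). */
long faults_on_first_touch(size_t pages) {
    assert(pages > 0);

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0) {
        return -1;
    }
    size_t len = pages * (size_t)page;

    /* The mapping itself is cheap: no physical memory is committed yet. */
    volatile unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED) {
        return -1;
    }

    long before = fault_count();
    for (size_t i = 0; i < pages; i++) {
        buf[i * (size_t)page] = 1; /* first write to this page */
    }
    long after = fault_count();

    munmap((void *)buf, len);
    if (before < 0 || after < 0) {
        return -1;
    }
    return after - before;
}
```

Calling `faults_on_first_touch(64)` and printing the result should show the faults landing on the writes rather than on the `mmap` call. The exact count depends on the kernel and on things like transparent huge pages, so treat the number as something to measure, not something to rely on.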

[0] https://www.sciencedirect.com/topics/computer-science/prefet....

[1] https://docs.oracle.com/cd/E36784_01/html/E36859/epmpw.html

[2] https://stackoverflow.com/a/52377359/435796

> And also, does it imply the existence of non-blocking memory

Yes, actually! If you know you're going to need a block of memory before you actually need it, you can put in a request to the memory controller ahead of time, then proceed to do some other work and check back in when you're ready for the data or when the memory controller signals you it's done. It's just that this kind of thing is usually the scope of compiler optimizations or hyper-optimized software like Varnish cache rather than something your average web developer thinks about. It's again conceptually the same as an async network request, but you bother with one while considering the other just "a cost" because of the different timescales.

Is that the same thing as a prefetch?
Yep!
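
For the curious, a minimal sketch of what a software prefetch can look like in C. It assumes GCC or Clang, since `__builtin_prefetch` is a compiler builtin rather than standard C. The function name `sum_indirect` and the distance `PREFETCH_AHEAD` are invented for the example, and the value 8 is a guess you would tune by measuring.

```c
#include <assert.h>
#include <stddef.h>

/* How many iterations ahead to request data. Hypothetical starting point:
 * too small and the data is not there yet, too large and it may be evicted. */
enum { PREFETCH_AHEAD = 8 };

/* Sum table[idx[i]] for i in [0, n). The scattered reads through idx are
 * the kind of access pattern where cache misses tend to pile up. */
long sum_indirect(const long *table, const size_t *idx, size_t n) {
    assert(table != NULL && idx != NULL);

    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_AHEAD < n) {
            /* Ask for a future element now: 0 = we intend to read,
             * 1 = low expected reuse. This is only a hint to the CPU. */
            __builtin_prefetch(&table[idx[i + PREFETCH_AHEAD]], 0, 1);
        }
        /* The "other work" that overlaps with the outstanding request. */
        total += table[idx[i]];
    }
    return total;
}
```

Note that the hint never reports completion. "Checking back in" is just the ordinary load a few iterations later, which is fast if the line has arrived and stalls as usual if it has not. Whether this beats the hardware prefetcher depends on the access pattern, so benchmark before keeping it.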