Hacker News new | ask | show | jobs
by zozbot234 233 days ago
> You would need an hardware instruction that tells you if a load or store would fault.

You have MADV_FREE pages/ranges. They get cleared when purged, so reading zeros tells you that the load would have faulted and needs to be populated from storage.

1 comments

MADV_FREE is insufficient - userspace doesn’t get a signal from the OS to know when there’s system wide memory pressure and having userspace try to respond to such a signal would be counter productive and slow in a kernel operation that needs to be a fast path. It’s more that you want MADV (page cache) a memory range and then have some way to have a shared data structure where you are told if it’s still resident and can lock it from being paged out.
MADV_FREE is also extremely expensive. CPU vendors have finally simplified TLB shootdown in recent CPUs with both AMD and Intel now having instructions to broadcast TLB flushes in hardware, which gets rid of one of the worst sources of performance degradation in threaded multicore applications (oh the pain of IPIs mixed with TLB flushing!). However, it's still very expensive to walk page tables and free pages.

Hardware reference counting of memory allocations would be very interesting. It would be shockingly simple to implement compared to many other features hardware already has to tackle.

> MADV_FREE is also extremely expensive.

It's quite expensive to free pages under memory pressure (though it's not clear that there's any other choice to be made), but if the pages are never freed it should be cheap, AIUI.

> userspace doesn’t get a signal from the OS to know when there’s system wide memory pressure

Memory pressure indicators exist, https://docs.kernel.org/accounting/psi.html

> have some way to have a shared data structure where you are told if it’s still resident and can lock it from being paged out.

What's more efficient than fetching data and comparing it with zero? Any write within the range will then cancel the MADV_FREE property on the written-to page thus "locking" it again, and this is also very efficient.