


	by anarazel 406 days ago
	From what I've seen a surprisingly large part of the overhead is due to SMAP when doing larger reads from the page cache - i.e. if I boot with clearcpuid=smap (not for prod use!), larger reads go significantly faster. On both Intel and AMD CPUs interestingly. On Intel it's also not hard to simply reach the per-core memory bandwidth with modern storage HW. This matters most prominently for writes by the checkpointing process, which needs to compute data checksums given the current postgres implementation (if enabled). But even for reads it can be a bottleneck, e.g. when prewarming the buffer pool after a restart.

2 comments

derefr 406 days ago

> if I boot with clearcpuid=smap (not for prod use!), larger reads go significantly faster. On both Intel and AMD CPUs interestingly.

Is there a page anywhere that collects these sorts of "turn the whole hardware security layer off" switches that can be flipped to get better throughput out of modern x86 CPUs, when your system has no real attack surface to speak of (e.g. air-gapped single-tenant HPC)?


the8472 406 days ago

On the kernel side there's a boot parameter for all of them: mitigations=off. Software that was compiled with additional fences may have to be recompiled to remove them.

https://www.kernel.org/doc/html/latest/admin-guide/kernel-pa...
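To see what the running kernel reports for each known CPU vulnerability (for example before and after booting with mitigations=off), the sysfs directory /sys/devices/system/cpu/vulnerabilities can be read. The following is a minimal sketch, assuming a Linux system that exposes that directory; the function name read_mitigation_status is made up for illustration.

```python
from pathlib import Path

# Standard sysfs location on Linux; absent on other systems and very old kernels.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def read_mitigation_status(vuln_dir=VULN_DIR):
    """Return {vulnerability_name: status_text}, or {} if the directory is missing.

    Hypothetical helper: it only reads files, it changes nothing.
    """
    vuln_dir = Path(vuln_dir)
    if not vuln_dir.is_dir():
        return {}
    status = {}
    for entry in sorted(vuln_dir.iterdir()):
        try:
            status[entry.name] = entry.read_text().strip()
        except OSError:
            # Skip entries that cannot be read.
            continue
    return status


if __name__ == "__main__":
    for name, text in read_mitigation_status().items():
        print(f"{name}: {text}")
```

Each file holds a one-line status string, so comparing the output across boots shows which mitigations a given kernel command line actually changed.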


starspangled 406 days ago

mitigations=off disables workarounds for bugs or "mis-features" in the CPU that could be exploited to bypass OS security measures.

SMAP is an OS security measure, so it does not get disabled by mitigations=off. SMAP can be a significant drain on certain I/O workloads, though. IMO it should be better known or covered by a more obvious option.
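A quick way to check which of the two switches is in effect on a given boot is to look at the CPU flags the kernel reports and at the kernel command line. This is a rough sketch, assuming x86 Linux, where /proc/cpuinfo has a "flags" line and, as far as I know, a feature cleared with clearcpuid= no longer shows up there. The helper names cpu_flags and smap_report are hypothetical.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def smap_report(cpuinfo_text, cmdline_text):
    """Summarize SMAP and mitigation related state from the two text blobs."""
    params = cmdline_text.split()
    return {
        # True if the kernel still lists smap as a usable CPU feature.
        "smap_flag_present": "smap" in cpu_flags(cpuinfo_text),
        # True only if the exact parameter mitigations=off was passed at boot.
        "mitigations_off": "mitigations=off" in params,
        # Any clearcpuid=... parameters, returned verbatim.
        "clearcpuid": [p for p in params if p.startswith("clearcpuid=")],
    }


if __name__ == "__main__":
    with open("/proc/cpuinfo") as f:
        cpuinfo = f.read()
    with open("/proc/cmdline") as f:
        cmdline = f.read()
    print(smap_report(cpuinfo, cmdline))
```

On a machine booted with only mitigations=off, this would be expected to still report the smap flag as present, which matches the point above that the two options are independent.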

Linux kernel developers are really bad at defining and naming options like this.


amluto 406 days ago

SMAP overhead should be roughly constant, and I’d be quite surprised if it’s noticeable for large reads. Small reads are a different story.


anarazel 406 days ago

It turns out to be the other way round, curiously. The bigger the reads (i.e. how much to read in one syscall) and the bigger the target area of the reads (how long before a target memory location is reused), the bigger the overhead of SMAP gets.

If that's of interest, I can dig up the reproducer I had at some point.
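For anyone who wants to poke at this in the meantime, here is a rough sketch of the kind of measurement described: reading an already cached file with a configurable read size per syscall and a configurable target area that the reads rotate through. This is not the reproducer mentioned above, just an illustration. The function read_throughput and its parameters are made up, it assumes Linux with os.preadv available, and Python overhead will distort results for small reads.

```python
import os
import time


def read_throughput(path, read_size, total_bytes, target_size=None):
    """Read about total_bytes from path with preadv and return bytes per second.

    read_size   : bytes requested per syscall
    target_size : size of the destination area the reads rotate through;
                  a larger area means longer before a location is reused
    """
    target_size = max(target_size or read_size, read_size)
    view = memoryview(bytearray(target_size))
    file_size = os.path.getsize(path)
    fd = os.open(path, os.O_RDONLY)
    try:
        done = file_off = buf_off = 0
        start = time.perf_counter()
        while done < total_bytes:
            # Wrap around in the file and in the target area.
            if file_off + read_size > file_size:
                file_off = 0
            if buf_off + read_size > target_size:
                buf_off = 0
            n = os.preadv(fd, [view[buf_off:buf_off + read_size]], file_off)
            if n <= 0:
                break
            done += n
            file_off += n
            buf_off += n
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return done / elapsed if elapsed > 0 else 0.0


if __name__ == "__main__":
    import sys
    path = sys.argv[1]
    total = 4 * os.path.getsize(path)
    read_throughput(path, 1 << 20, os.path.getsize(path))  # warm the page cache
    for size in (4 << 10, 64 << 10, 1 << 20, 16 << 20):
        rate = read_throughput(path, size, total, target_size=64 << 20)
        print(f"{size:>10} bytes/read: {rate / 1e9:.2f} GB/s")
```

Running it once on a normal boot and once with clearcpuid=smap (again, not for prod use) and comparing the numbers per read size is the comparison being discussed. The file should fit in RAM so that the reads come from the page cache and not from storage.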


amluto 406 days ago

That is definitely interesting.
