by thanatos519 1664 days ago
The main point of HT is to reduce the cost of context switching by keeping twice the number of contexts close to the core. I would guess that parts of the process context like program counter, TLB, etc live inside the 'HT' and would have to be saved/restored every time the process moves between threads, even on the same core. Reserving both 'HT' on a core gets you cache locality, but isn't there a cost to moving the process back and forth, even if that data is in L1/L2?

(I'm looking at 'lstopo' from package 'hwloc', Linux on my Haswell Xeon: 10MB shared L3, 256KB L2, 32KB L1{d,i} per core)

Given my (educated) guess, I've told irqbalance to put interrupts only on 'thread 0' and then I schedule cpu-intensive tasks to 'thread 1' and schedule them very-not-nicely. Linux seems pretty good about keeping everything else on 'thread 0' when I have 'thread 1' busy so I don't do any further management.

I can have 4 cores 'thread 1' pegged at 100% with no impact on interactive or I/O performance.
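
A rough sketch of the "CPU-heavy work goes on thread 1" part of that setup, for anyone who wants to try it. It assumes a hypothetical layout where logical CPU n and n+4 share a core (check with lstopo first); the names ASSUMED_SIBLINGS, second_threads and pin_to_second_threads are made up for illustration, and the irqbalance and priority parts are not covered.

```python
import os

# Hypothetical sibling layout for a 4-core / 8-thread machine.
# This is an assumption, not something read from the hardware.
ASSUMED_SIBLINGS = [(0, 4), (1, 5), (2, 6), (3, 7)]


def second_threads(sibling_groups):
    """Return the 'thread 1' CPUs: the highest-numbered logical CPU of each core.

    Cores with only one logical CPU (no SMT) contribute nothing.
    """
    return {max(group) for group in sibling_groups if len(group) > 1}


def pin_to_second_threads(pid, sibling_groups):
    """Restrict `pid` to the 'thread 1' CPUs (Linux only).

    A pid of 0 means the calling process.
    """
    cpus = second_threads(sibling_groups)
    if not cpus:
        raise ValueError("no SMT siblings in the given layout")
    os.sched_setaffinity(pid, cpus)
    return cpus
```

Calling pin_to_second_threads(0, ASSUMED_SIBLINGS) from the CPU-heavy job would confine it to CPUs 4-7 under the assumed layout. Raising its priority on top of that usually needs extra privileges and is left out here.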


In the context of the article, if you are trying to keep foreign processes "off my cores" then you can't neglect to keep them off the adjacent hyperthreads, because those share some of the resources. If you have 8 threads on 4 cores then, at least the way Linux counts them, logical CPUs 0 and 4 share some caches and all backend execution resources. So if you have isolated CPU 0 but not CPU 4, you might as well not have done anything at all.
This makes sense in general, because the caches are the most precious resource.
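
For checking which numbers actually pair up (the 0-and-4 case above), Linux exposes the sibling list in sysfs under each CPU's topology/thread_siblings_list. A minimal sketch of reading it and flagging siblings that were left out of an isolation set; the function names are made up for illustration, and it assumes the usual /sys/devices/system/cpu layout.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Parse a kernel cpulist string such as '0,4' or '0-3,8-11' into a sorted list."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def thread_siblings(cpu, sysfs_root="/sys/devices/system/cpu"):
    """Return the logical CPUs that share a physical core with `cpu` (including itself)."""
    path = Path(sysfs_root) / f"cpu{cpu}" / "topology" / "thread_siblings_list"
    return parse_cpu_list(path.read_text())


def unisolated_siblings(isolated, sysfs_root="/sys/devices/system/cpu"):
    """Return CPUs that share a core with an isolated CPU but are not isolated themselves."""
    isolated = set(isolated)
    leaks = set()
    for cpu in isolated:
        leaks.update(set(thread_siblings(cpu, sysfs_root)) - isolated)
    return sorted(leaks)
```

If unisolated_siblings([0]) comes back non-empty, the isolation has the hole described above: something else can still be scheduled onto the other hardware thread of that core.
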

However, in my case the working set is small enough and the processes are top-priority so they probably stay in the L2 if not the L1. Also ... I want to keep using my desktop so I don't mind the intrusion of my interactive processes.

Hmm. Is there a way to check how much L1/L2/L3 a process is occupying?

> in my case the working set is small enough and the processes are top-priority so they probably stay in the L2 if not the L1.

Maybe! Maybe not. If it's top priority on core X but something else with a much bigger (or cache-unfriendly) dataset is on the sibling hyperthread, then your high-priority process can still have cache misses.

No, but it is possible on certain top-end Intel SKUs to partition the last-level caches such that they are effectively reserved to certain processes.
pqos?
Even RDT isn't going to give you insight into L1 occupancy.
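
On the L3 side, where the hardware supports cache monitoring, the Linux resctrl filesystem exposes per-group occupancy as mon_data/mon_L3_*/llc_occupancy files. A minimal sketch of reading them, assuming resctrl is mounted and the tasks of interest have already been placed in a monitoring group; the function name and the example group path are hypothetical, and nothing here says anything about L1 or L2.

```python
from pathlib import Path


def llc_occupancy_bytes(group_dir):
    """Return {L3 domain name: occupancy in bytes} for one resctrl group directory.

    Domains whose file does not hold a plain number (for example
    'Unavailable') are skipped.
    """
    result = {}
    pattern = "mon_data/mon_L3_*/llc_occupancy"
    for path in sorted(Path(group_dir).glob(pattern)):
        text = path.read_text().strip()
        if text.isdigit():
            result[path.parent.name] = int(text)
    return result
```

Usage would look like llc_occupancy_bytes("/sys/fs/resctrl/mon_groups/mygroup"), where mygroup is a made-up group name. The numbers are a snapshot per L3 domain for everything in the group, not a per-process breakdown.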