by joosters
3300 days ago
$ yes |pv > /dev/null
46.6GiB 0:00:05 [9.33GiB/s]
$ taskset 1 yes |taskset 1 pv > /dev/null
32.9GiB 0:00:05 [6.58GiB/s]
$ taskset 1 yes |taskset 2 pv > /dev/null
45.7GiB 0:00:05 [9.13GiB/s]
$ taskset 1 yes |taskset 4 pv > /dev/null
45.7GiB 0:00:05 [9.18GiB/s]
Very rough numbers: the 9.13/9.33 difference flip-flopped when I ran the commands again. Binding both processes to the same core is definitely a performance hit, though. There might be some gain from a shared cache, but more is lost through the lack of parallelism. I tried masks 2 and 4 because I'm not sure how 'real' cores vs 'hyperthread' cores are numbered. These numbers are from an i7-7700K.
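
For anyone wanting to check the numbering instead of guessing: without -c, taskset reads its argument as a hexadecimal bitmask, so 1, 2 and 4 select CPUs 0, 1 and 2. A rough sketch that decodes the masks and, on Linux, asks sysfs which logical CPUs share a physical core. The helper names `cpus_from_mask` and `thread_siblings` are made up for illustration, and the sysfs lookup simply reports "unknown" where the file is missing.

```python
from pathlib import Path


def cpus_from_mask(mask_hex: str) -> list[int]:
    """Decode a taskset-style hex affinity mask into logical CPU numbers."""
    mask = int(mask_hex, 16)
    # Bit n set in the mask means logical CPU n is allowed.
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def thread_siblings(cpu: int) -> str:
    """Return the logical CPUs sharing a core with `cpu`, per Linux sysfs."""
    path = Path(f"/sys/devices/system/cpu/cpu{cpu}/topology/thread_siblings_list")
    # The file only exists on Linux; fall back to "unknown" elsewhere.
    return path.read_text().strip() if path.exists() else "unknown"


if __name__ == "__main__":
    for mask in ("1", "2", "4"):
        for cpu in cpus_from_mask(mask):
            print(f"mask {mask} -> cpu{cpu}, siblings: {thread_siblings(cpu)}")
```

If the siblings list for cpu0 includes cpu1, then masks 1 and 2 land on two hyperthreads of the same physical core; if it does not, they are separate cores.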
Assuming pv uses splice(), there is only one copy in the workload: copy_from_user() from a fixed source buffer into kernel-allocated pages, which are then spliced to /dev/null.
If those pages are not "recycled" (through an LRU scheme for allocation), the destination changes on every write and the L2 cache is constantly thrashed.
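
To make the data path concrete, here is a minimal sketch of the same shape of workload: one write() into a pipe (the user-to-kernel copy), then the pipe contents handed to /dev/null. It is not pv's actual code. The function name `drain_pipe_to_null` is hypothetical, os.splice needs Python 3.10+ on Linux, and the sketch falls back to a plain read/write loop (which does copy) where it is unavailable.

```python
import os


def drain_pipe_to_null(payload: bytes) -> int:
    """Write payload into a pipe, move it to /dev/null, return bytes moved.

    Keep payload below the pipe capacity (64 KiB by default on Linux),
    otherwise the single write() below would block.
    """
    read_fd, write_fd = os.pipe()
    null_fd = os.open(os.devnull, os.O_WRONLY)
    try:
        # Producer side: write() copies the user buffer into kernel pipe pages.
        os.write(write_fd, payload)
        os.close(write_fd)  # closing the write end lets the reader see EOF
        write_fd = -1

        moved = 0
        while True:
            if hasattr(os, "splice"):
                # Consumer side: hand pipe pages to /dev/null without
                # copying them back into user space.
                n = os.splice(read_fd, null_fd, 65536)
            else:
                # Assumption: portable fallback, which does copy the data.
                chunk = os.read(read_fd, 65536)
                n = os.write(null_fd, chunk) if chunk else 0
            if n == 0:  # pipe is empty and the write end is closed
                break
            moved += n
        return moved
    finally:
        for fd in (read_fd, write_fd, null_fd):
            if fd >= 0:
                os.close(fd)


if __name__ == "__main__":
    print(drain_pipe_to_null(b"y\n" * 2048))
```

In the splice path the only copy is the one inside write(), which is the copy_from_user() step described above.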