|
For languages like C, C++, and Rust, the bottleneck is mainly going to be system calls. With a big buffer, on an old machine, I get about 1.5 GiB/s with C++. Writing 1 char at a time, I get less than 1 MiB/s.
$ ./a.out 1000000 2000 | cat >/dev/null
buffer size: 1000000, num syscalls: 2000, perf:1578.779593 MiB/s
$ ./a.out 1 2000000 | cat >/dev/null
buffer size: 1, num syscalls: 2000000, perf:0.832587 MiB/s
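To make the gap concrete, here is a quick sketch of the arithmetic behind those two runs. The helper name `micros_per_call` is made up for illustration, and the result is the average cost of one write() as seen by the writer, which includes whatever time the pipe and cat add on the other side.

```cpp
#include <cstddef>

// Hypothetical helper (not part of the benchmark): average microseconds per
// write() call implied by a measured throughput in MiB/s and the number of
// bytes passed to each call.
double micros_per_call(double mib_per_s, std::size_t bytes_per_call) {
    const double bytes_per_s = mib_per_s * 1024.0 * 1024.0;
    return static_cast<double>(bytes_per_call) / bytes_per_s * 1e6;
}
```

Plugging in the numbers above, micros_per_call(0.832587, 1) comes to a little over 1 microsecond and micros_per_call(1578.779593, 1000000) to roughly 0.6 milliseconds. So each 1-byte call costs about a microsecond of mostly fixed overhead, while the 1 MB call spreads that overhead over a million bytes.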
Code is:
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <unistd.h>
int main(int argc, char **argv) {
    // Usage: ./a.out <buffer size in bytes> <number of write() calls>
    // Build without -DNDEBUG so the asserts stay active.
    assert(argc == 3);
    const std::size_t n = std::strtoul(argv[1], nullptr, 10);
    const std::size_t k = std::strtoul(argv[2], nullptr, 10);
    char *buf = new char[n];
    std::memset(buf, '1', n);
    auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < k; i++) {
        // One write(2) syscall per iteration; a short write aborts the run.
        ssize_t rv = write(1, buf, n);
        assert(rv == static_cast<ssize_t>(n));
    }
    auto stop = std::chrono::steady_clock::now();
    std::chrono::duration<double> secs = stop - start;
    // Report on stderr so the result is not mixed into the data on stdout.
    std::fprintf(stderr, "buffer size: %zu, num syscalls: %zu, perf:%f MiB/s\n", n, k, (double(n) * k) / (1024 * 1024) / secs.count());
    delete[] buf;
}
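If the program naturally produces output in small pieces, the usual way to get the big-buffer numbers is to batch the pieces in user space and only call write() when the buffer fills up. Below is a minimal sketch of that idea. `BufferedWriter` is a hypothetical class written for this comment, not something from a library, and stdio's own buffering (fwrite, putchar) already does much the same thing.

```cpp
#include <cerrno>
#include <cstddef>
#include <vector>
#include <unistd.h>

// Hypothetical helper: collects small writes in user space and issues one
// write(2) per full buffer instead of one per byte.
class BufferedWriter {
public:
    BufferedWriter(int fd, std::size_t capacity)
        : fd_(fd), buf_(capacity ? capacity : 1) {}
    ~BufferedWriter() { flush(); }  // errors at destruction are ignored

    // Append one byte; only enters the kernel when the buffer is full.
    bool put(char c) {
        if (used_ == buf_.size() && !flush()) return false;
        buf_[used_++] = c;
        return true;
    }

    // Write out everything buffered so far, retrying on short writes.
    // On error the buffer contents are left as they are.
    bool flush() {
        std::size_t off = 0;
        while (off < used_) {
            ssize_t rv = ::write(fd_, buf_.data() + off, used_ - off);
            ++syscalls_;
            if (rv < 0) {
                if (errno == EINTR) continue;
                return false;
            }
            off += static_cast<std::size_t>(rv);
        }
        used_ = 0;
        return true;
    }

    // Number of write(2) calls issued so far.
    std::size_t syscalls() const { return syscalls_; }

private:
    int fd_;
    std::vector<char> buf_;
    std::size_t used_ = 0;
    std::size_t syscalls_ = 0;
};
```

With a 1000-byte buffer, 10000 calls to put() should turn into about 10 write() calls rather than 10000, assuming no short writes. The capacity is a tuning knob; I have not measured where the returns flatten out.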
EDIT: Also note that a big write to a pipe (bigger than the pipe's capacity, which is 64 KiB by default on Linux; PIPE_BUF only bounds atomic writes) may require multiple syscalls on the read side.
EDIT 2: Also, it appears that the kernel is smart enough to not copy anything when it's clear that there is no need. When I don't go through cat, I get rates that are well above memory bandwidth, implying that it's not doing any actual work:
$ ./a.out 1000000 1000 >/dev/null
buffer size: 1000000, num syscalls: 1000, perf: 1827368.373827 MiB/s
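The point in the first EDIT about the read side can be checked with a small experiment. This is only a sketch: `reads_needed` is a made-up helper, it assumes a POSIX system with fork(), and the reader's buffer is as large as the whole message so that any extra reads come from the pipe itself rather than from a small read buffer.

```cpp
#include <cstddef>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: a child process writes n bytes into a pipe and the
// parent counts how many read(2) calls it takes to drain them.
// Returns -1 on error.
long reads_needed(std::size_t n) {
    int fds[2];
    if (pipe(fds) != 0) return -1;
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        // Child: write n bytes, retrying on short writes, then exit.
        close(fds[0]);
        std::vector<char> out(n, '1');
        std::size_t off = 0;
        while (off < n) {
            ssize_t rv = write(fds[1], out.data() + off, n - off);
            if (rv <= 0) _exit(1);
            off += static_cast<std::size_t>(rv);
        }
        _exit(0);
    }
    // Parent: close the write end so read() reports EOF once the child exits.
    close(fds[1]);
    std::vector<char> in(n);
    long calls = 0;
    std::size_t total = 0;
    bool failed = false;
    for (;;) {
        ssize_t rv = read(fds[0], in.data(), in.size());
        if (rv < 0) { failed = true; break; }
        if (rv == 0) break;  // EOF, or nothing requested when n is 0
        ++calls;             // count only reads that returned data
        total += static_cast<std::size_t>(rv);
    }
    close(fds[0]);
    int status = 0;
    waitpid(pid, &status, 0);
    if (failed || total != n) return -1;
    return calls;
}
```

On a typical Linux box I would expect reads_needed(4 * 1024 * 1024) to come back well above 1, since each read can only return what currently fits in the pipe, while a 1-byte message takes a single read. The exact count depends on the pipe size and on scheduling, so treat it as a rough indicator.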
|
I may be wrong, though. Check with lsof or similar.