|
Indeed -- in Go it's technically "concurrency", not "parallelism" (a distinction the Go folks are careful to draw), but with multiple processors available you do in effect get multicore processing, so calling it "parallelism" isn't far off :). More specifically, Miller uses separate goroutines (more or less threads): 2 for input (one for raw byte-stream ingest and one for forming records), 1 for each verb (sort then head would be two verbs), and 1 for output (records back to strings), all pipelined together. Some of the recent perf improvements came from splitting the record-reader into those two concurrent goroutines; some of the older ones came from pipelining the verbs. But yes, say you have 16 CPUs and you launch 10, or 20, or 30 Miller executables: in the C implementation each one would soak a single CPU, and for anything beyond 16 the OS would have to multitask. With Miller 6 it's basically the same kind of thing, except each executable will try to use more than one CPU if it can. If it can't, the Go runtime and the OS will multitask, and both tend to handle this gracefully, with no tuning needed on the user's part. I'd also point out that even with big files and multi-verb processing chains, htop rarely shows over 250% CPU, maybe 350% for deeper chains: the input and output processing typically take most of the time, and the verbs in between not as much. When we run out of memory, systems crash; when we run out of CPU, things just take a bit longer. I.e. the oversubscription is a real but non-fatal issue, for C or for Go ... |
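
To make the pipelining concrete, here is a rough sketch of the shape described above: one goroutine for raw ingest, one for forming records, one per verb, and one for output, all connected by channels. This is not Miller's actual code. The names `record`, `stage`, `sortStage`, `headStage`, and `runPipeline` are hypothetical, and the comma-split "parsing" and the channel buffer size of 8 are assumptions made purely for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// record is a hypothetical stand-in for a parsed record (not Miller's type).
type record []string

// stage is one "verb": it reads records from in and writes results to out.
type stage func(in <-chan record, out chan<- record)

// sortStage has to see all of its input before it can emit anything.
func sortStage(in <-chan record, out chan<- record) {
	var all []record
	for r := range in {
		all = append(all, r)
	}
	sort.Slice(all, func(i, j int) bool { return all[i][0] < all[j][0] })
	for _, r := range all {
		out <- r
	}
}

// headStage passes the first n records through and drains the rest,
// so the upstream goroutine never blocks on a full channel.
func headStage(n int) stage {
	return func(in <-chan record, out chan<- record) {
		count := 0
		for r := range in {
			if count < n {
				out <- r
				count++
			}
		}
	}
}

// runPipeline wires up 2 input goroutines, 1 per verb, and 1 for output.
func runPipeline(lines []string, verbs ...stage) []string {
	raw := make(chan string, 8)
	go func() { // input goroutine 1: raw ingest
		defer close(raw)
		for _, l := range lines {
			raw <- l
		}
	}()

	parsed := make(chan record, 8)
	go func() { // input goroutine 2: form records from raw lines
		defer close(parsed)
		for l := range raw {
			parsed <- strings.Split(l, ",")
		}
	}()

	var recs <-chan record = parsed
	for _, v := range verbs { // one goroutine per verb
		out := make(chan record, 8)
		go func(v stage, in <-chan record, out chan record) {
			defer close(out)
			v(in, out)
		}(v, recs, out)
		recs = out
	}

	text := make(chan string, 8)
	go func() { // output goroutine: records back to strings
		defer close(text)
		for r := range recs {
			text <- strings.Join(r, ",")
		}
	}()

	var result []string
	for s := range text {
		result = append(result, s)
	}
	return result
}

func main() {
	out := runPipeline([]string{"pan,3", "eks,1", "wye,2"}, sortStage, headStage(2))
	fmt.Println(out) // prints [eks,1 pan,3]
}
```

With a chain like sort then head, that is five goroutines besides main, but as noted above they are rarely all busy at once: a stage such as the sort here blocks until its input is exhausted, and cheap stages spend most of their time waiting on their neighbours.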