by gadmm
2170 days ago
Regarding latency, beyond the kind of benchmarks proposed in the paper (from which this performance claim comes), keep in mind that the behaviour is qualitatively different between the two alternatives. Systems programmers in OCaml can apply techniques to achieve low latency, but these are unlikely to scale as well under ParMinor. Indeed, with the latter, it is not going to be possible to have a low-latency domain and a non-low-latency domain coexist: the whole program has to be written in the low-latency style. This is clear from the qualification “stop-the-world” for ParMinor, but worth noting for people who would read the claim out of context. There can be further qualitative differences in the GC design that are not measured by the benchmarks.
To be clear, these systems programs are not going to see a slowdown when run in a sequential setting (which is what I expect the majority of use cases will be). It is unlikely that these systems programs will immediately want to take advantage of parallelism. Moreover, Multicore OCaml aims to add support for shared-memory parallel programming to increase the throughput of the program. Getting high throughput while maintaining very low latency is a big challenge in general, and can't be solved by the runtime system alone. It needs a different way of writing programs altogether.
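As a rough illustration of what "shared-memory parallelism for throughput" can look like, here is a minimal sketch that splits a sum across two domains. It assumes the `Domain.spawn` / `Domain.join` interface as it later shipped in OCaml 5; `sum_range` and `parallel_sum` are hypothetical names chosen for the example, not anything from the paper.

```ocaml
(* Sum the integers in the half-open range [lo, hi) sequentially. *)
let sum_range lo hi =
  let acc = ref 0 in
  for i = lo to hi - 1 do
    acc := !acc + i
  done;
  !acc

(* Split [0, n) into two halves and sum each half on its own domain.
   Assumes the OCaml 5 Domain module (Domain.spawn, Domain.join). *)
let parallel_sum n =
  let mid = n / 2 in
  let left = Domain.spawn (fun () -> sum_range 0 mid) in
  let right = Domain.spawn (fun () -> sum_range mid n) in
  Domain.join left + Domain.join right
```

The two halves share nothing mutable, so the only coordination is the final `Domain.join`. This is the throughput-oriented style; it says nothing by itself about pause times.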
> Indeed, with the latter, it is not going to be possible to have a low-latency domain and a non-low-latency domain coexist: the whole program has to be written in the low-latency style.
This is wrong. If the low-latency code is written as it is (no allocations), then in ParMinor the low-latency domain will remain responsive, at the cost of increased latency on the non-low-latency domains. Of course, no GC safe points should be inserted in that low-latency code. This design is not ossified.
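One way to sanity-check the "no allocations" property of a hot path is to compare `Gc.minor_words ()` before and after running it. The sketch below is only that: `sum_bytes` is a hypothetical stand-in for low-latency code, and `minor_words_used` is a helper invented for the example. It measures minor-heap allocation only and says nothing about where the compiler places safe points.

```ocaml
(* Hypothetical low-latency routine: sums the bytes of a preallocated
   buffer using only integer arithmetic in the loop body. *)
let sum_bytes (buf : Bytes.t) : int =
  let acc = ref 0 in
  for i = 0 to Bytes.length buf - 1 do
    acc := !acc + Char.code (Bytes.unsafe_get buf i)
  done;
  !acc

(* Run [f] and return its result together with the number of words
   allocated on the minor heap while it ran, as reported by the runtime. *)
let minor_words_used f =
  let before = Gc.minor_words () in
  let result = f () in
  let after = Gc.minor_words () in
  (result, after -. before)
```

For example, `minor_words_used (fun () -> sum_bytes buf)` on a buffer allocated beforehand should report a delta that does not grow with the buffer size. With the native compiler one would expect it to be zero or close to it, but that is an assumption worth checking on the compiler and flags actually in use.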
Another detail to remember is that ParMinor is a stop-the-world parallel minor collector. The major collection is a concurrent mark-and-sweep, which keeps latency low. I'd recommend checking out the new concurrent GC for GHC, which uses concurrent collection for the major heap and parallel collection for the minor heap [1], similar to ParMinor.
Ultimately, Multicore OCaml aims to offer an easy way of helping 95% of programs take advantage of parallel execution to increase throughput without compromising latency. It is quite hard to fully support the various expert use cases, especially since they rely on details of the existing runtime system, which may no longer hold when the runtime system changes.
[1] https://dl.acm.org/doi/pdf/10.1145/3381898.3397214