by jbellis
1547 days ago
I think the Streams designers did an amazing job. One of my favorite things about modern Java. Yes, it's more verbose than Python list comprehensions, but it's both higher-performance (parallelism that just works) and more productive (static typing means it almost always works as intended the first time).
|
The work-stealing mechanism used by streams doesn't really work. I've frequently seen people get something like a 1.7x speedup on an 8-core machine where I was able to get a 7.8x speedup on the same machine using a ThreadPoolExecutor.
For common "embarrassingly parallel" problems there are two parameters you need to set: (1) How many threads to use, and (2) How fine to subdivide the problem.
Often the basic work unit takes much less time to complete than it takes to switch between threads. For instance, a raytracer can probably trace one ray in less time than it takes to communicate between threads. If you try to parallelize a task with too fine a granularity, you get a slowdown, not a speedup. You might find you get a good speedup over a fairly wide range of granularities (you might do well with anywhere between 100 and 10,000 rays per batch), but batching of some kind is essential.
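To make the batching point concrete, here is a minimal sketch of what I mean. The class name `BatchedWork`, the method `runBatched`, and the stand-in work function `traceOne` are all made up for illustration; a real raytracer would do something far more expensive per item, and the right batch size has to be measured rather than guessed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class BatchedWork {
    // Hypothetical stand-in for one cheap unit of work, such as tracing a single ray.
    static long traceOne(int i) {
        return (long) i * i;
    }

    // Processes items [0, n) in batches of batchSize on a fixed pool of `threads` workers.
    // Each submitted task handles a whole batch, so the per-task overhead is paid
    // once per batch rather than once per item.
    static long runBatched(int n, int batchSize, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int start = 0; start < n; start += batchSize) {
                final int lo = start;
                final int hi = Math.min(n, start + batchSize);
                futures.add(pool.submit(() -> {
                    long sum = 0;
                    for (int i = lo; i < hi; i++) {
                        sum += traceOne(i);
                    }
                    return sum;
                }));
            }
            long total = 0;
            for (Future<Long> f : futures) {
                total += f.get(); // blocks until that batch has finished
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        // 1,000 items per batch sits inside the 100 to 10,000 range mentioned above.
        System.out.println(runBatched(1_000_000, 1_000, cores));
    }
}
```

Setting `batchSize` to 1 gives you the too-fine-grained case, which is an easy way to see the overhead for yourself on your own hardware.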
As for the thread count, it depends on whether the job is CPU bound. A CPU-bound job needs about as many threads as you have cores or SMT "threads". If the job is I/O bound you usually need many more threads to maximize performance, but it's tricky. A web crawler might be able to support hundreds or thousands of threads, but if you point all those threads at one server you might crash it, get banned, or both.
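A rough sketch of the crawler case, with the caveat that everything here is illustrative: `PoliteCrawlerSketch`, `ioBoundThreads`, and `crawlOneHost` are made-up names, the sizing formula is just the common rule of thumb of cores times (1 + wait time / compute time), and `Thread.sleep` stands in for a real network call. The idea is to keep a large pool for I/O-bound work while a `Semaphore` caps how many requests hit any one host at a time.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class PoliteCrawlerSketch {
    // Rule-of-thumb pool size for I/O-bound work (an assumption, not a guarantee):
    // the more time a task spends waiting relative to computing, the more threads help.
    static int ioBoundThreads(int cores, double waitToComputeRatio) {
        return Math.max(1, (int) Math.round(cores * (1.0 + waitToComputeRatio)));
    }

    // Fires `requests` fake requests at a single host from a pool of `threads` workers,
    // allowing at most `perHostLimit` in flight at once. Returns the peak number of
    // simultaneous requests that was actually observed.
    static int crawlOneHost(int threads, int perHostLimit, int requests) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        Semaphore hostPermits = new Semaphore(perHostLimit);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (int i = 0; i < requests; i++) {
            pool.execute(() -> {
                try {
                    hostPermits.acquire();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                try {
                    int now = inFlight.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    Thread.sleep(5); // placeholder for the real network call
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    inFlight.decrementAndGet();
                    hostPermits.release();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        int threads = ioBoundThreads(cores, 9.0); // e.g. 90% of each task is waiting
        System.out.println("threads=" + threads
                + " peak per host=" + crawlOneHost(threads, 4, 200));
    }
}
```

A real crawler would keep one semaphore (or rate limiter) per host rather than a single one, but the shape is the same.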
If the awkward streams API bought you good performance and reliability (let's see... just about zero support for error handling), that would be one thing, but it doesn't.
Static typing working so well is not a special feature of the streams API but rather a result of the brilliant engineering that went into JDK 8. You can easily write your own "map()" functions and other higher-order functions that do many of the things the Stream API does.
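For example, a hand-rolled `map()` and `filter()` over lists takes only a few lines and type-checks just as well, since the typing comes from generics and the `java.util.function` interfaces rather than from streams. The class name `Hof` is made up for this sketch, and it is eager and list-only, so it is not a replacement for everything streams do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

final class Hof {
    // Applies f to every element and collects the results in a new list.
    static <T, R> List<R> map(List<T> in, Function<? super T, ? extends R> f) {
        List<R> out = new ArrayList<>(in.size());
        for (T x : in) {
            out.add(f.apply(x));
        }
        return out;
    }

    // Keeps only the elements for which the predicate returns true.
    static <T> List<T> filter(List<T> in, Predicate<? super T> keep) {
        List<T> out = new ArrayList<>();
        for (T x : in) {
            if (keep.test(x)) {
                out.add(x);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "bb", "ccc");
        // The compiler infers List<Integer> here; a wrong element type will not compile.
        List<Integer> lengths = map(words, String::length);
        System.out.println(filter(lengths, n -> n > 1));
    }
}
```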
It would really be nice to see a better third-party API.