by vbezhenar 962 days ago
How often is it used? I write a lot of Java and I have never used this feature. For me, it made the streams implementation unbearably complex, to the point that I can't read its source, all for a feature that I will probably never use.

I personally wish they had never implemented this parallel feature. For me, the JDK would be better without it.

3 comments

Starting to program with Java streams is weird because you are using functional constructs in a language that historically had little notion of them. When I first started using them, they felt worthless because I could achieve the same thing faster with classic OOP (and often have it run faster too). But after a while you get a feel for the fluent style of programming that streams enable (and, imo, cleaner code). These days, with ChatGPT, it's probably a lot easier to get started.

With that said, you should almost always write a stream thinking only sequentially first, then identify the steps that can benefit from .parallel() and parallelize only those. It's leveraging .parallel() efficiently that provides an advantage at run time, which is why I tend to use it.
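
As a rough illustration of that workflow, here is a minimal sketch: the pipeline is written and checked sequentially first, and .parallel() is added afterwards. The class name SequentialFirst and the expensive() helper are hypothetical stand-ins, not anything from the thread. One caveat worth keeping in mind: in the JDK, .parallel() sets the execution mode for the whole pipeline rather than for a single step, so "parallelizing one step" in practice means isolating that step in its own pipeline.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class SequentialFirst {

    // Hypothetical stand-in for a CPU-heavy, side-effect-free step.
    static long expensive(int n) {
        long acc = n;
        for (int i = 0; i < 10_000; i++) {
            acc = (acc * 31 + i) % 1_000_003;
        }
        return acc;
    }

    // Step 1: write and verify the pipeline sequentially.
    static List<Long> sequential(int count) {
        return IntStream.range(0, count)
                .mapToLong(SequentialFirst::expensive)
                .boxed()
                .collect(Collectors.toList());
    }

    // Step 2: opt in to parallel execution once the pipeline is known to be correct.
    // The only change is the .parallel() call; it applies to the entire pipeline.
    static List<Long> parallel(int count) {
        return IntStream.range(0, count)
                .parallel()
                .mapToLong(SequentialFirst::expensive)
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Both variants should produce the same list in the same order.
        System.out.println(sequential(1_000).equals(parallel(1_000)));
    }
}
```

Because the mapping step has no side effects and the source is ordered, the two versions should return equal lists; whether the parallel one is actually faster depends on the workload and on how many cores are idle.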

I guess I was unclear, but I wrote specifically about parallel streams. Of course I use ordinary streams on a daily basis. But using a parallel stream in a server application that processes dozens of other requests simultaneously, and that runs on a server with a dozen other applications (a very typical use case for Java), makes very little sense: the CPUs are already loaded, so it will just result in more context switches. I could imagine a use case for it (a very urgent request that must be completed at the expense of other requests and that includes heavy collection processing), but I've yet to encounter one.
Yeah, I don't imagine native stream parallelism will help when the CPUs are already loaded. Presumably you're using Spring or Rx, in which case you can probably leverage reactive streams and/or Managed Blockers, but that's really just taking advantage of async patterns and not necessarily parallelism. The only case where I could envision a concrete benefit is if you used your own fork-join pool leveraging virtual threads, instead of the global fork-join pool, to prevent platform threads from hogging the CPU, and then used reactive streams that leveraged the virtual threads. Although this would (theoretically) raise responsiveness, it would almost certainly come at the cost of throughput. All that is to say, parallelism generally only provides as much value as you have idle CPU cores.
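
For the "own fork-join pool instead of the global one" part, a minimal sketch of the commonly used trick looks like the following. It uses an ordinary platform-thread ForkJoinPool, not virtual threads or reactive streams, and the class name DedicatedPool and method sumOfSquares are hypothetical. It also assumes behavior that is widely observed but, as far as I know, not promised by the Stream Javadoc: a parallel stream started from inside a ForkJoinPool task runs its work in that pool rather than in the common pool.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.LongStream;

final class DedicatedPool {

    // Runs a parallel stream from inside a dedicated pool instead of the common pool.
    // Assumption: relies on observed JDK behavior, not a documented guarantee.
    static long sumOfSquares(long n, int parallelism)
            throws InterruptedException, ExecutionException {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            // The terminal operation is invoked on a worker thread of `pool`.
            return pool.submit(() ->
                    LongStream.rangeClosed(1, n)
                            .parallel()
                            .map(x -> x * x)
                            .sum()
            ).get();
        } finally {
            // Always release the pool's threads, even if the task fails.
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sumOfSquares(10, 2));
    }
}
```

Capping the pool's parallelism this way limits how many threads one request can occupy, which fits the point above: it trades throughput for isolation and does nothing to create idle cores.
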
Indeed, and some things become really hard to write because the operations are required to cope with parallel execution that I never want. Anyone doing serious parallelism is going to reach for another library anyway (Rx etc). Making streams parallel was a massive mistake and has just led to enormous accidental complexity. God how I wish they’d just added map/filter/reduce to the collection interfaces.
The Stream API is insanely useful with just serial execution alone. A nested for loop with random breaks (which over time will pick up a random side effect here and there, making it a completely unreadable mess) is much worse than the “pipeline-y” behavior of streams.
It is useful, but it also has weaknesses. For example, I’ve lost count of the number of times I’ve seen someone forget to close() an IO-based stream (e.g. Files.lines). But probably for 99% of cases you could get away with slurping the entire file into memory and returning a list. The streams API is optimised for the 1% of cases that need the additional complexity, at the expense of worse ergonomics for the common case.
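
To make the close() point concrete, here is a small sketch contrasting the two approaches. The class name ReadLines and its methods are hypothetical; Files.lines and Files.readAllLines are the standard java.nio.file calls. The lazy variant keeps a file handle open until the stream is closed, so it belongs in a try-with-resources block, while the eager variant reads everything up front and leaves nothing to close.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Stream;

final class ReadLines {

    // Lazy variant: the stream holds an open file handle, so it must be closed.
    static long countNonEmpty(Path file) throws IOException {
        try (Stream<String> lines = Files.lines(file)) {
            return lines.filter(line -> !line.isEmpty()).count();
        }
    }

    // Eager variant: slurps the whole file into a list; nothing to close afterwards.
    static List<String> slurp(Path file) throws IOException {
        return Files.readAllLines(file);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("lines-demo", ".txt");
        try {
            Files.write(tmp, List.of("alpha", "", "beta"));
            System.out.println(countNonEmpty(tmp));
            System.out.println(slurp(tmp).size());
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

The eager version is the more forgiving default for small files; the lazy one only pays off when the file is too large to hold in memory comfortably.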

I get why it ended up that way. I think it would have been difficult at that time to get traction for adding FP features just as a convenience. So they needed to do all the parallel stuff to justify it. But, as I said, anyone I see doing serious parallelism in Java is not using the streams API.

I use parallelization all the time. It's easily added later thanks to this feature, unlike, say, rewriting a code base from a single-threaded language because you thought async constructs would be enough. Never making that mistake again.