by rbanffy 5879 days ago
I find it interesting the developer has to worry about the parallelization. Shouldn't the data persistence layer (be it a relational database or some other data engine) deal with that (and use parallelization whenever possible)?

Ideally, the tools should figure all this out. In practice, deciding whether there is enough work for parallelism to save you time is a hard problem, and programmer annotations help a lot. Additionally, PLINQ (Parallel LINQ) differs from single-threaded LINQ in ways that can affect your program. For instance, it may return the results of a query in a different order than they were inserted into the collection. Some programs rely on the ordering in LINQ, so at this point PLINQ is opt-in, rather than opt-out or tools-decide. There is more detail on this issue and some others here ( http://msdn.microsoft.com/en-us/magazine/cc163329.aspx ).
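To make the opt-in and ordering points concrete, here is a minimal sketch. It assumes a .NET version that ships PLINQ in System.Linq and a reasonably recent C# compiler; the class and method names are made up for illustration. `AsParallel()` is the opt-in, and `AsOrdered()` asks PLINQ to preserve source order, usually at some cost.

```csharp
using System;
using System.Linq;

public static class PlinqOrderingSketch
{
    // Sequential LINQ-to-Objects: results come back in source order.
    public static int[] SequentialEvens(int[] source) =>
        source.Where(n => n % 2 == 0).ToArray();

    // Opt-in parallel query: same elements, but the order is not guaranteed.
    public static int[] ParallelEvens(int[] source) =>
        source.AsParallel().Where(n => n % 2 == 0).ToArray();

    // Parallel query that explicitly asks for source order to be preserved.
    public static int[] ParallelOrderedEvens(int[] source) =>
        source.AsParallel().AsOrdered().Where(n => n % 2 == 0).ToArray();

    public static void Main()
    {
        int[] numbers = Enumerable.Range(0, 1000).ToArray();
        int[] sequential = SequentialEvens(numbers);

        // With AsOrdered(), the parallel result matches the sequential one exactly.
        Console.WriteLine(sequential.SequenceEqual(ParallelOrderedEvens(numbers)));

        // Without it, only the set of elements is guaranteed, so sort before comparing.
        Console.WriteLine(sequential.SequenceEqual(ParallelEvens(numbers).OrderBy(n => n)));
    }
}
```

A program that silently depends on the first behavior is exactly the kind that could break if parallelism were turned on automatically.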
Again, ideally, this decision should be passed down to the underlying datastore. Only at that level do you have reliable information on how the data is distributed among cores.

On what kinds of back-end do LINQ and PLINQ support parallelization?

I don't understand which underlying datastore you're referring to in the case of LINQ-to-Objects or LINQ-to-XML. These frameworks perform "queries" on ordinary objects or XML data held in memory within the .NET process, not on some other machine as with LINQ-to-SQL. You use LINQ-to-Objects in place of a filtering foreach loop, and LINQ-to-XML in place of XPath et al.
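As a rough illustration of what "in place of a filtering foreach loop" means, here is a sketch with made-up names (`LongWordsLoop`, `LongWordsLinq`). Both methods run over an ordinary in-memory collection; there is no datastore involved.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinqToObjectsSketch
{
    // Hand-written filtering loop over an in-memory sequence.
    public static List<string> LongWordsLoop(IEnumerable<string> words, int minLength)
    {
        var result = new List<string>();
        foreach (var word in words)
        {
            if (word.Length >= minLength)
            {
                result.Add(word);
            }
        }
        return result;
    }

    // The same filter expressed as a LINQ-to-Objects query.
    public static List<string> LongWordsLinq(IEnumerable<string> words, int minLength) =>
        words.Where(word => word.Length >= minLength).ToList();

    public static void Main()
    {
        var words = new[] { "map", "filter", "reduce", "zip", "aggregate" };

        // Both calls walk the same array inside the current process.
        Console.WriteLine(string.Join(", ", LongWordsLoop(words, 6)));
        Console.WriteLine(string.Join(", ", LongWordsLinq(words, 6)));
    }
}
```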

PLINQ is just a parallel implementation of LINQ-to-Objects and LINQ-to-XML. It works in the same scenarios .NET programs work in, i.e. multicore or SMP machines running Windows (not sure if PLINQ works with Mono).