by axod 5877 days ago
FWIW I find the LINQ far harder to read.

  var developerNames = employees.AsParallel().AsOrdered()
                              .Where(e => e.Role == Role.Developer)
                              .OrderBy(e => e.LastName)
                              .Select(e => e.FullName)
                              .ToArray();
That's just gross IMHO. AsParallel() yuck. AsOrdered() eugh why are these functions being used to set parameters.

Many people find function chaining more intuitive to read, or come to enjoy it after being exposed to it for a while. Functions are used to set flags or parameters because one of the bases of functional programming is removing side effects.

In the above example, the state of employees is the same after the statement executes as it was before. If you were to set the "Parallel" flag or "Ordered" flag beforehand, each as its own assignment statement, then you would have modified the initial object and created a side effect.
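To make the side-effect point concrete, here is a minimal sketch. The Employee and Role types are hypothetical stand-ins for whatever the original snippet used; only the query itself comes from the thread. It runs the query and then checks that the source list is unchanged.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NoSideEffectsDemo
{
    // Hypothetical stand-ins for the types in the original snippet.
    public enum Role { Developer, Manager }

    public sealed class Employee
    {
        public Employee(string fullName, string lastName, Role role)
        {
            FullName = fullName;
            LastName = lastName;
            Role = role;
        }

        public string FullName { get; }
        public string LastName { get; }
        public Role Role { get; }
    }

    // The query from the thread. AsParallel() and AsOrdered() wrap the
    // source in a new query object; they do not set flags on employees.
    public static string[] DeveloperNames(IEnumerable<Employee> employees) =>
        employees.AsParallel().AsOrdered()
                 .Where(e => e.Role == Role.Developer)
                 .OrderBy(e => e.LastName)
                 .Select(e => e.FullName)
                 .ToArray();

    public static void Main()
    {
        var employees = new List<Employee>
        {
            new Employee("Zed Young", "Young", Role.Developer),
            new Employee("Ann Baker", "Baker", Role.Manager),
            new Employee("Bob Adams", "Adams", Role.Developer),
        };
        var before = employees.ToList(); // snapshot of the original contents

        var names = DeveloperNames(employees);

        Console.WriteLine(string.Join(", ", names));        // Bob Adams, Zed Young
        Console.WriteLine(employees.SequenceEqual(before)); // True: source untouched
    }
}
```

The list still holds the same three employees in their original order afterwards; the sorting and filtering happened on the query result only.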

I am not challenging your opinion, but simply answering your question as to why.

I, for one, find that kind of function chaining somewhat unintuitive even while I understand the logic behind it. I expect methods of an object to modify that object, not to return a modified version of the object.
I'm not sure if you find it easier to read, but this is analogous:

  var developerNames = from emp in employees.AsParallel().AsOrdered()
                       where emp.Role == Role.Developer
                       orderby emp.LastName
                       select emp.FullName;
Nope, if anything that's worse. But thanks, interesting to see another example.
I think it only looks gross because they're methods on an object, and you're used to that meaning "imperative mutable thing." Compare this:

    employees.AsParallel().AsOrdered()
to the more-traditional-C-style

    MaintainOrder(Parallelize(employees))
or the admittedly more attractive

    (maintain-order (parallelize employees))
and suddenly it seems perfectly sensible. I thought it was ugly at first, too, but now I'm used to it.
How is

    (maintain-order (parallelize employees))
vs

    MaintainOrder(Parallelize(employees))
"admittedly more attractive"? I enjoy functional constructs, but now you're just talking syntax right?
I agree, this is a strange claim. On the other hand (as long as we are with Clojure), something like

  (->> employees parallelize maintain-order)
might be said to be more attractive because it looks more linear. The sequential order of application is thus maintained, and the parentheses which are perceived as additional levels of hierarchy are removed.

Of course, the dot-dot-dot style in Java/C#/etc is doing the same:

  employees.Parallelize().MaintainOrder() 
is also linear; the parentheses are only used to specify parameters.
I'm just being a smug weenie is all. Obviously there's not a real difference here. However, it's a better question as to whether this style or the "pipeline" style illustrated below and in the C# code is more pleasant.
I think one of the problems I have with it is that AsParallel() doesn't seem to me like it should be a function. It should be an argument to a function. I dislike that.
Depends on perception. What about

    someFunction.Memoize();
That's perfectly natural, right? (I mean, you could claim that "is memoized or not" should be a "flag" on a function, but that's starting to sound a little silly to my ears.) It's exactly analogous.
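For what it's worth, here is one way such a Memoize() could look in C#. This is a sketch and not a framework API: the extension method, its class, and the demo names are all made up for illustration. Like AsParallel(), it returns a wrapped version and leaves the original function alone.

```csharp
using System;
using System.Collections.Generic;

public static class FuncExtensions
{
    // Hypothetical helper, not part of .NET. Returns a new function that
    // caches results per argument. The cache is not thread-safe.
    public static Func<TArg, TResult> Memoize<TArg, TResult>(this Func<TArg, TResult> f)
    {
        var cache = new Dictionary<TArg, TResult>();
        return arg =>
        {
            if (!cache.TryGetValue(arg, out var result))
            {
                result = f(arg);     // first call for this argument: compute
                cache[arg] = result; // and remember the answer
            }
            return result;
        };
    }
}

public static class MemoizeDemo
{
    // Counts how often the underlying function actually runs.
    public static int CountUnderlyingCalls()
    {
        int calls = 0;
        Func<int, int> square = x => { calls++; return x * x; };

        var fastSquare = square.Memoize();
        fastSquare(4);
        fastSquare(4); // served from the cache
        square(4);     // the original is unchanged, so this runs again

        return calls;
    }

    public static void Main()
    {
        Console.WriteLine(CountUnderlyingCalls()); // 2
    }
}
```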
Is it just the name? AsParallel versus ToParallel, for example?
I find it interesting the developer has to worry about the parallelization. Shouldn't the data persistence layer (be it a relational database or some other data engine) deal with that (and use parallelization whenever possible)?
Ideally, the tools should figure all this out. In practice, figuring out if there is enough work for parallelism to save you time is a hard problem, and programmer annotations help a lot. Additionally, PLINQ (parallel LINQ) is different from single-threaded LINQ in ways that can affect your program. For instance, it may return the results of a query in a different order than they were inserted into the collection. Some programs rely on the ordering in LINQ, so PLINQ is opt-in, rather than opt-out or tools-decide, at this point. There is more detail on this issue and some others here ( http://msdn.microsoft.com/en-us/magazine/cc163329.aspx ).
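A small sketch of the ordering difference, using only the standard AsParallel() and AsOrdered() operators. The class and method names are made up for the example. Without AsOrdered() the result contains the same elements, but PLINQ makes no promise about their order, so the demo only compares that result after sorting it.

```csharp
using System;
using System.Linq;

public static class OrderingDemo
{
    // AsOrdered() asks PLINQ to preserve the source order in the result.
    public static int[] EvensOrdered(int count) =>
        Enumerable.Range(0, count).AsParallel().AsOrdered()
                  .Where(n => n % 2 == 0)
                  .ToArray();

    // Without AsOrdered() the result order is unspecified.
    public static int[] EvensUnordered(int count) =>
        Enumerable.Range(0, count).AsParallel()
                  .Where(n => n % 2 == 0)
                  .ToArray();

    public static void Main()
    {
        // Single-threaded LINQ as the reference result.
        var sequential = Enumerable.Range(0, 100000).Where(n => n % 2 == 0).ToArray();

        // Ordered PLINQ matches the sequential result element for element.
        Console.WriteLine(EvensOrdered(100000).SequenceEqual(sequential)); // True

        // Unordered PLINQ has the same elements; sort before comparing.
        var unordered = EvensUnordered(100000);
        Console.WriteLine(unordered.OrderBy(n => n).SequenceEqual(sequential)); // True
    }
}
```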
Again, ideally, this decision should be passed down to the underlying datastore. It's only at that level that you have the reliable information on how the data is distributed among cores.

On what kinds of back-end do LINQ and PLINQ support parallelization?

I'm not understanding what underlying datastore you're referring to in the case of LINQ-to-Objects or LINQ-to-XML. These frameworks perform "queries" on ordinary objects or XML data stored in memory within the .Net process, not on some other machine as with LINQ-to-SQL. You use LINQ-to-Objects in place of a filtering foreach loop, and LINQ-to-XML in place of XPath et al.
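As an illustration of the "filtering foreach loop" point, here is a minimal sketch with made-up names. Both versions work on an ordinary in-memory collection, with no datastore involved.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FilterDemo
{
    // The hand-written filtering loop.
    public static List<int> EvensWithLoop(IEnumerable<int> numbers)
    {
        var result = new List<int>();
        foreach (var n in numbers)
        {
            if (n % 2 == 0)
            {
                result.Add(n);
            }
        }
        return result;
    }

    // The same filter expressed with LINQ-to-Objects.
    public static List<int> EvensWithLinq(IEnumerable<int> numbers) =>
        numbers.Where(n => n % 2 == 0).ToList();

    public static void Main()
    {
        var numbers = new[] { 1, 2, 3, 4, 5, 6 };
        Console.WriteLine(string.Join(", ", EvensWithLoop(numbers))); // 2, 4, 6
        Console.WriteLine(string.Join(", ", EvensWithLinq(numbers))); // 2, 4, 6
    }
}
```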

PLINQ is just a parallel implementation of LINQ-to-Objects and LINQ-to-XML. It works in the same scenarios .Net programs work in, i.e. multicores or SMPs running Windows (not sure if PLINQ works with Mono).