by batou 3968 days ago
Not a great fan of LINQ if I'm honest. Sure it's terse and powerful but with it comes great responsibility and a relatively large pile of tasty landmines. Three regular problems I see:

FirstOrDefault being passed 2 rows = non-determinism. This one is always fun when your SQL server doesn't guarantee the order of rows returned so every time you hit .FirstOrDefault you get a different item. Never happens in dev due to the tiny datasets fitting in a single page.

There's also Single being passed 2 rows = crash. SingleOrDefault being passed >1 rows = crash.
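For anyone who wants to see the three behaviours side by side, here is a minimal in-memory sketch. The `rows` data and the `Throws` helper are made up for illustration; only the LINQ operators themselves come from the framework.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory data: two rows share the same Key.
var rows = new[] { (Key: 1, Name: "a"), (Key: 1, Name: "b"), (Key: 2, Name: "c") };

// FirstOrDefault silently returns whichever match the source yields first.
var first = rows.FirstOrDefault(r => r.Key == 1);
Console.WriteLine(first.Name); // "a" here, only because arrays enumerate in index order

// Single and SingleOrDefault both throw when more than one element matches.
Console.WriteLine(Throws(() => rows.Single(r => r.Key == 1)));          // True
Console.WriteLine(Throws(() => rows.SingleOrDefault(r => r.Key == 1))); // True

// SingleOrDefault only falls back to the default value when nothing matches.
Console.WriteLine(rows.SingleOrDefault(r => r.Key == 99).Name is null); // True

// Hypothetical helper: reports whether the action throws InvalidOperationException.
static bool Throws(Action action)
{
    try { action(); return false; }
    catch (InvalidOperationException) { return true; }
}
```

Against a database the first case is the dangerous one, since nothing fails and the row you get depends on whatever order the server returns.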

And when it does crash it's entirely unhelpful, because the lambda block reports no state information in the stack trace. Inevitably this leads to adding precondition checks to avoid "Hey everyone, one of the 20 LINQ expressions in this method blew up with more than one element in the sequence".

Then there's the fact that you don't know the size of the data set, nor really think about it. So if someone passes in a collection of 10 rows in dev, and the code turns out to be O(N^2) with a random .ToList() in it, then you're up shit creek in the memory and time departments when that 10,000-item collection appears in production.
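A rough sketch of how that kind of quadratic cost hides in a couple of innocent-looking lines. The `customers` and `orders` collections and both function names are invented for the example; it is one common shape of the problem, not the only one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical data: n customers and n orders, one order per customer.
const int n = 1000;
var customers = Enumerable.Range(0, n).Select(i => (Id: i, Name: "c" + i)).ToList();
var orders = Enumerable.Range(0, n).Select(i => (Id: i, CustomerId: n - 1 - i)).ToList();

// Quadratic: every order triggers a linear scan over all customers.
List<string> SlowNames() =>
    orders.Select(o => customers.First(c => c.Id == o.CustomerId).Name).ToList();

// Linear: build the index once, then each lookup is a hash probe.
List<string> FastNames()
{
    var byId = customers.ToDictionary(c => c.Id);
    return orders.Select(o => byId[o.CustomerId].Name).ToList();
}

Console.WriteLine(SlowNames().SequenceEqual(FastNames())); // True
```

Both versions look equally tidy with 10 rows; only the first one falls over as n grows.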

None of this is unique to LINQ, but it hides a lot of problems behind a wall of pain.

From extensive experience: there be dragons.

7 comments

With all respect, it doesn't sound like you've taken the time to understand LINQ:

- Single and SingleOrDefault crashing with >1 row is a feature, not a bug. It means that your expectation that your filter is unique was incorrect. In that scenario, a crash is what you want. It's like when we use asserts. Similarly ..

- First and FirstOrDefault are only really appropriate after an OrderBy, i.e. to get the top item of a particular ranking. Using them anywhere else is almost always incorrect, and in the very few cases where it isn't, needs a comment to explain why, or I'll bounce your code back to you in review :).
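To make the OrderBy point concrete, a small sketch with made-up `orders` data. The tie-breaker on `Id` is my own addition, there so that the ranking is total and the result cannot depend on source order.

```csharp
using System;
using System.Linq;

// Hypothetical data; the array order stands in for whatever order a query returns.
var orders = new[]
{
    (Id: 3, Total: 20m),
    (Id: 1, Total: 50m),
    (Id: 2, Total: 50m),
};

// Explicit ranking: highest Total first, ties broken by Id, so the result
// no longer depends on the order the source happened to produce.
var top = orders
    .OrderByDescending(o => o.Total)
    .ThenBy(o => o.Id)
    .First();

Console.WriteLine(top.Id); // 1
```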

LINQ is very expressive (you can do a lot with a little bit of code), especially with the embedded (non-fluent) syntax. That means that, yes, you can write some non-optimal code without realizing it, because it's only a few lines. On the flip side, you can also refactor that code quickly into more optimal patterns. By comparison, old-school iterating through collections is often much more difficult to refactor.

I understand LINQ very well (right down to code generation and expression trees) but unfortunately I work with a lot of other people's code.

Perhaps that is the problem? :-)

More seriously, the thing has really leaky abstraction boundaries which is a main pain point. I have to write and manage a lot of boring enterprise code and creativity and expressiveness go a long way to introducing a lot of additional work.

LINQ is not where the leaky abstraction is, IEnumerable is. IEnumerable makes virtually no guarantees about anything, including order, finiteness, side effects and speed/efficiency.

This is actually one of the main reasons that the BCL team introduced the IReadOnlyXXX<> interfaces - over the years, many .NET developers trying to do the right thing and express intent in their code had taken to using IEnumerable<> in method signatures to signify that the method would not modify the list/collection... but an IEnumerable doesn't have to be a list/collection, it can be infinite! In trying to do the right thing and be expressive with intent, lots and lots of people were unwittingly losing the expression of another (very important) assumption.

Any time you are wrangling any kind of enumerable in code, you have to think about its guarantees and assumptions. LINQ's expressiveness might make it easier to forget this but it's still the rule - if I hand you a "malicious IEnumerable", it doesn't matter if you use LINQ or a couple of foreach loops, it's going to launch the missiles on every call to MoveNext() regardless.
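A harmless stand-in for the "malicious IEnumerable", as a sketch: the `Numbers` iterator and the `sideEffects` counter are invented here, and the counter takes the place of whatever the real side effect would be.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int sideEffects = 0;

// Hypothetical source: an iterator whose body runs again on every enumeration.
IEnumerable<int> Numbers()
{
    sideEffects++; // stands in for a query, a file read, or worse
    yield return 1;
    yield return 2;
}

var lazy = Numbers();
Console.WriteLine(sideEffects); // 0, nothing has run yet

// LINQ or a plain foreach: each one enumerates the source from scratch.
Console.WriteLine(lazy.Count()); // 2
foreach (var item in lazy) { }
Console.WriteLine(sideEffects); // 2

// Materializing once gives a snapshot with list guarantees.
IReadOnlyList<int> snapshot = lazy.ToList();
Console.WriteLine(snapshot.Count + snapshot[0]); // 3
Console.WriteLine(sideEffects); // 3, and reading the snapshot adds no more
```

Taking an `IReadOnlyList<T>` in a signature states both "I won't modify this" and "this is finite and already materialized", which a bare `IEnumerable<T>` cannot.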

If you are trying to express intent in your code, IEnumerable is almost never what you want to use.

How should First work? How is this different than manually doing "SELECT TOP 1" or using LIMIT, but not specifying an ORDER BY? What a strange complaint.

And Single is supposed to crash if there's more than one row. Again, it's directly like selecting "TOP 2" then doing a check to make sure there isn't a second row. Your complaint seems like complaining that using a "while" conditional instead of an "if" can lead to trouble.
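Spelled out in code, the "TOP 2 then check" idea looks roughly like this. `ExactlyOne` is a hypothetical helper, not a framework method; the `what` parameter is there only so that the exception says which expectation failed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: same contract as Single(), but it reads at most two
// elements and names what it was looking for when the expectation fails.
static T ExactlyOne<T>(IEnumerable<T> source, string what)
{
    var firstTwo = source.Take(2).ToList();
    if (firstTwo.Count == 1) return firstTwo[0];

    var found = firstTwo.Count == 0 ? "none" : "more than one";
    throw new InvalidOperationException($"Expected exactly one {what}, found {found}.");
}

var ids = new[] { 7, 8, 8 };
Console.WriteLine(ExactlyOne(ids.Where(i => i == 7), "row with id 7")); // 7

try
{
    ExactlyOne(ids.Where(i => i == 8), "row with id 8");
}
catch (InvalidOperationException ex)
{
    Console.WriteLine(ex.Message); // Expected exactly one row with id 8, found more than one.
}
```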

Seriously I'd love to know what code you are writing where you weren't sure if you wanted First or Single, decided Single (thus enforcing that data integrity rule), then decided it's the library's fault for doing exactly what you asked.

> FirstOrDefault being passed 2 rows = non-determinism... There's also Single being passed 2 rows = crash. SingleOrDefault being passed >1 rows = crash.

Which is the point of the different APIs: you get to pick the behavior you want. Don't want it to crash? Pick `FirstOrDefault`. A formal API is better than manually codifying the elected behavior each time, which makes it tedious to infer the intent and is susceptible to human error from copy+pasting imperative code.

The problem is that it's a complex trade-off whichever way you turn. Knowing which trade-off to make is difficult for many people, as it requires knowing all of the assumptions and conditions that hold before and after the call, as well as how the expression will react under all conditions. Add complex predicates to that, which the language encourages, and it's pain.

Except your example isn't a complex trade-off -- it's just how it works. FirstOrDefault() returns the first or a default. It's right in the name. Which is first? Well, you'd better provide an order if you care. If you don't care, don't provide an order. There is nothing complex about that at all.

It almost seems like you're looking to blame LINQ for what is clearly poor database and project design. If you have poor design you're going to have problems no matter what.

    Objects.Where(el => el.Id == myRequestedId).FirstOrDefault().Name; // Crashes when there is no result

But this has been handled by the null-conditional operator in C# 6.0:

    Objects.Where(el => el.Id == myRequestedId).FirstOrDefault()?.Name;

The issue was not LINQ, it is/was C#.

Single() is for when there is exactly one result; more than one result is not expected (I'll admit that I never use Single() or SingleOrDefault()).

And paging is the same thing in LINQ as it is in SQL. If you don't know how LINQ works, you should learn it. What you should know is that .ToList() hits the database, so when you query all the objects, you get all the objects. Limit your query before the database query :)

Here are the results of page 4 with 10 items per page: Objects.Skip(10 * 3).Take(10).ToList(); // fetches 10 items = paging in the query

Here is how someone could write it without knowing LINQ: Objects.ToList().Skip(10 * 3).Take(10); // fetches all the items and does the paging on the list

As long as your data type is an IQueryable<T>, it doesn't hit the database. When it's a List<T> (e.g. after calling .ToList()) it's too late.
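The same effect can be seen without a database. This is only an in-memory analogy, assuming a lazy `Rows()` iterator (made up here) in place of an IQueryable<T>; with a real query provider the paging would be translated into the SQL itself rather than applied while streaming rows.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int pulled = 0;

// Hypothetical stand-in for a table of 1,000 rows; counts every row handed out.
IEnumerable<int> Rows()
{
    for (int i = 0; i < 1000; i++) { pulled++; yield return i; }
}

// Page 4 with 10 items per page: paging first, materializing last.
var page = Rows().Skip(10 * 3).Take(10).ToList();
int pulledEarly = pulled;
Console.WriteLine($"{page[0]}..{page[9]}, pulled {pulledEarly}"); // a few dozen rows, not 1000

// Materializing first: every row is pulled before the paging is applied.
pulled = 0;
var late = Rows().ToList().Skip(10 * 3).Take(10).ToList();
Console.WriteLine($"{late[0]}..{late[9]}, pulled {pulled}"); // all 1000 rows
```

Both pages contain the same items (30 through 39); the difference is only in how much work was done to get them.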

We do no LINQ near the database so some of those assumptions aren't valid here. We use NH criteria for that.

The null-conditional operator in C# 6 is welcome but then introduces another problem:

    var x = myObject?.Property1?.Property2?.Property3;
If x is null, why and where?

This is all in-memory manipulation of data. Mainly rules engine stuff for us.

But both you and I know that is bad design. The operator is a shortcut for simple scenarios, but in this case if a decision has to be made then you should be handling the null checks yourself and making those decisions explicitly.
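Handling the checks explicitly could look something like the sketch below. To keep it self-contained it uses the framework's own `Exception.InnerException` chain in place of the `Property1`/`Property2`/`Property3` chain from the example; `DescribeChain` is a hypothetical helper.

```csharp
using System;

// Hypothetical helper: walks the same chain as e?.InnerException?.InnerException?.Message
// but reports which link was missing instead of collapsing everything to null.
static string DescribeChain(Exception? e)
{
    if (e is null) return "e is null";
    if (e.InnerException is null) return "e.InnerException is null";
    if (e.InnerException.InnerException is null) return "e.InnerException.InnerException is null";
    return e.InnerException.InnerException.Message;
}

var outer = new Exception("outer", new Exception("middle"));

// The one-liner only says that something in the chain was null.
string? x = outer?.InnerException?.InnerException?.Message;
Console.WriteLine(x is null);            // True
Console.WriteLine(DescribeChain(outer)); // e.InnerException.InnerException is null
```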

A line like that is just lazy. I'm more keen to blame the developer and not the language. Striking nails with your fists as they say.

>There's also Single being passed 2 rows = crash. SingleOrDefault being passed >1 rows = crash.

It's not a crash, it's an exception. You can catch an exception. But, as others have said, you could instead call the function that doesn't throw an exception if having more than one result isn't an error.

> None of this unique to LINQ but it hides a lot of problems behind a wall of pain.

There is your problem, and it isn't LINQ. You'd have the same problem with C# without it. LINQ is a good declarative tool where declarative programming makes sense: data pipelines. And LINQ is 'lazy'...

You should be glad that an imperative language succeeded in getting declarative features. And that makes C# better than the competition in the same space, even though I never use MS techs.

Man, I wish Android had LINQ, and that Java had the language support LINQ relies on in C#. This is at or very near the top of the list of Things That Would Have Made Android Better, especially since non-game Android coding is so database-centric.

What you want exists, and it's called Xamarin.

True, but it means taking the rest of Xamarin, too. While Xamarin is brilliant, and by far the best cross-platform SDK I've seen for any set of targets, it's still riskier than native platform development for most developers. I'd use it for a client who has a C#-oriented in-house team and the ability and resources to commit to overcoming the vicissitudes of cross-platform development, but I wouldn't adopt it just to get LINQ.