by moonchrome 1387 days ago
My main problem with Entity Framework is the magic underneath.

Take a simple operation like this:

    x = Ef.Find(xid)
    x.Name = "something"
    y = Ef.Find(xid)

What is y.Name? Is it "something", even though you didn't save anything to the database yet? And the second Find didn't actually refresh from the database?
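
For anyone who wants to poke at the behaviour without a database, here is a minimal sketch of an identity map in Python. It is a toy model, not EF's API: `ToyContext`, `Person`, `find` and `save_changes` are made-up names. The only thing assumed about EF is that Find returns an already tracked instance before going to the database.

```python
# Toy model of an identity map; hypothetical names, NOT Entity Framework's API.
from dataclasses import dataclass


@dataclass
class Person:
    id: int
    name: str


class ToyContext:
    """Stand-in for a context that tracks every entity it has loaded."""

    def __init__(self, rows):
        self._rows = rows      # pretend database table: id -> name
        self._tracked = {}     # identity map: id -> entity instance
        self.db_reads = 0      # how often the "database" was actually read

    def find(self, entity_id):
        # A tracked instance is returned as-is, without touching the "database".
        if entity_id in self._tracked:
            return self._tracked[entity_id]
        self.db_reads += 1
        entity = Person(entity_id, self._rows[entity_id])
        self._tracked[entity_id] = entity
        return entity

    def save_changes(self):
        # Only here do in-memory edits reach the "database".
        for entity in self._tracked.values():
            self._rows[entity.id] = entity.name


rows = {1: "original"}
ctx = ToyContext(rows)

x = ctx.find(1)
x.name = "something"
y = ctx.find(1)

print(y is x)        # True: same object, not a fresh copy
print(y.name)        # something
print(rows[1])       # original: nothing has been saved yet
print(ctx.db_reads)  # 1: the second find never hit the "database"
```

So in this model y is literally x, and the unsaved edit is visible to anything else that shares the context.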

Oh and the random bugs where people improperly include related entities but it somehow ends up working because they are automatically added as you're firing off other related queries, until eventually it does not (usually in production only).
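
A rough sketch of how that kind of bug hides, again as a toy model in Python rather than EF itself. `ToyContext`, `load_blog`, `load_posts` and `_fix_up` are hypothetical names; the assumption is only that the context wires tracked entities into each other's navigation collections as they are loaded.

```python
# Toy model of relationship fix-up; hypothetical names, NOT EF's real API.
class Blog:
    def __init__(self, blog_id):
        self.id = blog_id
        self.posts = []   # navigation collection, empty until something fills it


class Post:
    def __init__(self, post_id, blog_id):
        self.id = post_id
        self.blog_id = blog_id


class ToyContext:
    def __init__(self, post_rows):
        self._post_rows = post_rows   # pretend table: list of (post_id, blog_id)
        self._blogs = {}              # tracked blogs by id
        self._posts = {}              # tracked posts by id

    def load_blog(self, blog_id):
        # Loads the blog WITHOUT its posts (the "forgot the include" case).
        if blog_id not in self._blogs:
            self._blogs[blog_id] = Blog(blog_id)
            self._fix_up()
        return self._blogs[blog_id]

    def load_posts(self, blog_id):
        # An unrelated query somewhere else that happens to load the same posts.
        for post_id, owner in self._post_rows:
            if owner == blog_id and post_id not in self._posts:
                self._posts[post_id] = Post(post_id, owner)
        self._fix_up()
        return [p for p in self._posts.values() if p.blog_id == blog_id]

    def _fix_up(self):
        # Wire every tracked post into its tracked blog's collection.
        for post in self._posts.values():
            blog = self._blogs.get(post.blog_id)
            if blog is not None and post not in blog.posts:
                blog.posts.append(post)


post_rows = [(10, 1), (11, 1)]

ctx_a = ToyContext(post_rows)
ctx_a.load_posts(1)                      # some other code path ran first
works = len(ctx_a.load_blog(1).posts)

ctx_b = ToyContext(post_rows)
breaks = len(ctx_b.load_blog(1).posts)   # same call, fresh context

print(works, breaks)   # 2 0
```

The `load_blog` call is identical in both cases; whether the collection is populated depends entirely on what else touched the context beforehand.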

It's a really really complex system designed to look simple and pave over important details with "works most of the time" defaults.

4 comments

Once you move beyond trivial cases you really need to spend time understanding the principles behind the ORM you're using. They are always a very leaky abstraction, there is not really a way around that.

In this case the important part to know is that the DbContext represents the unit of work and "knows" Entities you previously queried on it. That's very useful, but also can hide bugs like you mentioned with the Includes. I do wish that you'd get more obvious errors if you forget an include, this can be really annoying to debug especially if you're new to EF Core. For read queries I mostly use Select instead of Include, which I find easier and more straightforward in most cases.

ORMs are really useful for making very common operations easy and for making stuff composable. They're also very complex and to make the best use of them you do need to understand both SQL and some basics on how your specific ORM generates this SQL.

> They're also very complex and to make the best use of them you do need to understand both SQL and some basics on how your specific ORM generates this SQL.

I think the biggest pitfall is how it maps the object model to SQL.

The thing people fear, SQL query generation, is IMO a non-issue: when you identify hotspots you write the query manually. The tools for that are there, it's easy to do retroactively, and >90% of the code won't be on the critical path.

DbContext is basically shared mutable state across your entire execution scope, and worst of all, it makes that non-obvious.

> Once you move beyond trivial cases you really need to spend time understanding the principles behind the ORM you're using. They are always a very leaky abstraction, there is not really a way around that.

This is why I avoid ORMs in favor of writing SQL queries manually: I only need to understand one complex system for non-trivial cases instead of two.

(To be fair, I haven’t done any database programming for a few years. ORMs may have significantly improved since I last looked at them.)

> They are always a very leaky abstraction, there is not really a way around that.

Editing in general is hard. E.g. if in a form you change a field that participates in some filter which generates a dataset used in that form, a naive join now returns incorrect data (because the join condition itself was edited). Complex ORMs which help with that^ are not leaky abstractions, they just try to avoid mistakes an average programmer would make anyway in “trivial” SQL tasks without blinking once.

And yes, GP's question about x vs y means that no thought was ever given to editing contexts. Plain old fetch-store is too low-level and doesn’t represent a model that business logic thinks in.

^ Idk about EF in particular, just assuming

This is why I like TypeORM. There is no magic, and every command maps 1:1 with a database operation.

No weird caching, no auto saves. Just an object mapper that you can use when you want and ignore when you need to.

This example is incomplete. We need to see what the enclosing transaction scope looks like.

That's sort of my point: when you see a random DbContext read inside a function, you have no idea what the fetch will actually do. It might just return an object that was already fetched elsewhere in the context and modified but not saved. It might return the first value. It will automatically plug related entities into navigation collections, even if they are queried completely independently.

I concur with these. I see colleagues who use EF as 'black magic', who postpone or fear looking into what is happening under the hood. Because they lack insight into what it really does, they regularly cause horrendous queries to happen. My pet peeve with EF LINQ is that your C# is not really C#, so you may write queries that compile silently but fail to execute at runtime because the C# cannot be translated to SQL.

Funnily, one of my pet peeves is people worrying about the SQL generated from EF LINQ. If you care that much about it, I think you should just be writing the SQL by hand.

I might even go a step farther - you shouldn't care. It's the equivalent of caring whether or not your generated HTML or compiled IL or Assembly "looks nice."

If the SQL is performant and it returns the expected data, that is good enough for 99.9% of cases.

There are cases where you have to care, e.g. the difference between AsSingleQuery() and AsSplitQuery() in EF Core. This option only affects what kind of queries EF Core will create to perform the same job, but it can have pretty significant implications on performance in some cases, and it can affect the consistency guarantees you get for your results.
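
To make the difference concrete, here is a rough sketch of the two query shapes using plain SQL through Python's sqlite3. This is not the SQL EF Core actually emits, and the blog/post/tag tables are made up for illustration; it only shows why one big join and several small queries behave differently.

```python
import sqlite3

# Made-up schema for illustration: one blog with two child collections.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blog (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE post (id INTEGER PRIMARY KEY, blog_id INTEGER, body TEXT);
    CREATE TABLE tag  (id INTEGER PRIMARY KEY, blog_id INTEGER, name TEXT);
    INSERT INTO blog VALUES (1, 'ef notes');
    INSERT INTO post VALUES (1, 1, 'a'), (2, 1, 'b'), (3, 1, 'c');
    INSERT INTO tag  VALUES (1, 1, 'w'), (2, 1, 'x'), (3, 1, 'y'), (4, 1, 'z');
""")

# "Single query" shape: one statement, both collections joined at the same level.
# Every post row is repeated for every tag row of the same blog.
single = conn.execute("""
    SELECT b.id, p.id, t.id
    FROM blog b
    LEFT JOIN post p ON p.blog_id = b.id
    LEFT JOIN tag  t ON t.blog_id = b.id
""").fetchall()

# "Split query" shape: one statement per collection, stitched together in memory.
blogs = conn.execute("SELECT id FROM blog").fetchall()
posts = conn.execute("SELECT id FROM post WHERE blog_id = 1").fetchall()
tags = conn.execute("SELECT id FROM tag WHERE blog_id = 1").fetchall()

print(len(single))                          # 12 rows (3 posts x 4 tags)
print(len(blogs) + len(posts) + len(tags))  # 8 rows across three statements
```

The single join multiplies rows as collections grow, while the split version sends several statements, so unless they share a transaction with suitable isolation, another writer can change the data between them. That is the consistency trade-off.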

The abstraction is great, until it breaks.

In this case, with ORMs, even good ones, this happens often enough in production that to actually master the tool you do need to care.