by AaronFriel 2757 days ago
GraphQL is a curious beast. It was developed by Facebook as one of its many attempts to solve a fundamental problem for engineers: how to loosely couple the frontend and backend while maximizing query performance and allowing flexibility in the frontend.

GraphQL does this, yes, but it's not particularly _smart_ about how caching works or how to avoid the Select N+1 problem. Their solution* is the blunt hammer that is Facebook's dataloader project, which is basically: aggressively cache the data model, pretend databases, joins, and SQL don't exist, and throw away any hope of ACID/consistency. Dataloader, for example, exposes all sorts of new and exciting types of inconsistency. This is hand-waved away because, I guess, consistency is boring and user expectations are low or irrelevant. (A comment with a missing edge to a post is invisible, a post with a missing edge to a comment has 0 comments. It'll all work itself out in the end.)
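
For anyone unfamiliar with the Select N+1 problem, here is a minimal sketch using Python's built-in sqlite3. The schema, the `CountingDB` wrapper, and the function name are all made up for illustration; nothing here is GraphQL or dataloader code. It just shows a resolver-style loop issuing one query for the comments plus one more per comment.

```python
import sqlite3


def make_db():
    # Hypothetical schema: one post, three users, three comments.
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE comments (
            id INTEGER PRIMARY KEY, post_id INTEGER,
            author_id INTEGER, body TEXT);
        INSERT INTO posts VALUES (1, 'Hello');
        INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
        INSERT INTO comments VALUES
            (1, 1, 1, 'first'), (2, 1, 2, 'second'), (3, 1, 3, 'third');
    """)
    return db


class CountingDB:
    """Wraps a connection and counts the queries sent through it."""

    def __init__(self, db):
        self.db = db
        self.queries = 0

    def query(self, sql, params=()):
        self.queries += 1
        return self.db.execute(sql, params).fetchall()


def comments_with_authors_naive(cdb, post_id):
    # 1 query for the list of comments...
    comments = cdb.query(
        "SELECT author_id, body FROM comments WHERE post_id = ? ORDER BY id",
        (post_id,))
    result = []
    for author_id, body in comments:
        # ...plus 1 query per comment to resolve its author (the "N").
        (name,) = cdb.query(
            "SELECT name FROM users WHERE id = ?", (author_id,))[0]
        result.append((body, name))
    return result


cdb = CountingDB(make_db())
rows = comments_with_authors_naive(cdb, 1)
print(rows)
print("queries issued:", cdb.queries)
```

With three comments this issues four queries, and the count grows linearly with the number of comments.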

Curiously, Facebook went a long way down the road to fixing _this exact problem_ on the backend with a library called Haxl, written in Haskell. Haxl allows expressing relations between multiple data stores in a way that _looks_ like using an ORM, but under the hood creates a query and obviates the Select N+1 problem: a function which appears to select a post and for each comment retrieve an edge to the person who posted it will perform a single SELECT against the database, maintaining consistency with that store. There's no fundamental reason that couldn't be written in most dynamically typed languages or ORMs (though Haskell provides some really nice type level guarantees).
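
To be clear about the end result being described (not Haxl's API, which is Haskell and looks nothing like this), here is a rough Python/sqlite3 sketch with a made-up schema. The point is only that "a post, its comments, and each comment's author" can come back from one SELECT, so every row reflects the same state of the store.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY, post_id INTEGER,
        author_id INTEGER, body TEXT);
    INSERT INTO posts VALUES (1, 'Hello');
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO comments VALUES (1, 1, 1, 'first'), (2, 1, 2, 'second');
""")


def post_with_comment_authors(conn, post_id):
    # One statement: the database performs the joins, so there is no
    # window between "fetch comments" and "fetch authors".
    return conn.execute(
        """
        SELECT p.title, c.body, u.name
        FROM posts AS p
        JOIN comments AS c ON c.post_id = p.id
        JOIN users AS u ON u.id = c.author_id
        WHERE p.id = ?
        ORDER BY c.id
        """,
        (post_id,),
    ).fetchall()


joined = post_with_comment_authors(db, 1)
print(joined)
```

What Haxl adds, as I understand it, is that you don't write the combined fetch by hand; the sketch above only shows the shape of the query you end up with.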

What's bizarre to me is that the former took off, and the latter is largely unknown outside the Haskell community.

* - Other ORMs have recognized this, and there are efforts underway for GraphQL backends in Python (Graphene) and Ruby, at least, to solve this.

4 comments

I didn’t understand why you have to use caching with DataLoader. I am also not sure about what you mean by inconsistency. The n + 1 query problem is usually solved by making two consecutive queries to the database. One for the root element and one for the total list of all edges, thanks to DataLoader. What is the problem with that?
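
A rough sketch of the two-query pattern described above, in Python with sqlite3. `BatchLoader` is a hypothetical, synchronous stand-in: the real DataLoader is a JavaScript library that batches using promises, and its API is different. The schema is also made up.

```python
import sqlite3

# Hypothetical schema: two users, three comments (two by the same author).
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY, post_id INTEGER,
        author_id INTEGER, body TEXT);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO comments VALUES
        (1, 1, 1, 'first'), (2, 1, 2, 'second'), (3, 1, 1, 'third');
""")

query_count = {"n": 0}


class BatchLoader:
    """Collects keys, then resolves all of them with one batched fetch."""

    def __init__(self, batch_fn):
        self.batch_fn = batch_fn
        self.cache = {}
        self.pending = []

    def load(self, key):
        # Queue the key unless it is already cached or already queued.
        if key not in self.cache and key not in self.pending:
            self.pending.append(key)

    def dispatch(self):
        if self.pending:
            self.cache.update(self.batch_fn(self.pending))
            self.pending = []

    def get(self, key):
        return self.cache[key]


def fetch_users(ids):
    # Query 2: every requested author in a single WHERE ... IN (...).
    query_count["n"] += 1
    marks = ",".join("?" for _ in ids)
    rows = db.execute(
        f"SELECT id, name FROM users WHERE id IN ({marks})", ids).fetchall()
    return dict(rows)


loader = BatchLoader(fetch_users)

# Query 1: the root element's edges.
query_count["n"] += 1
comments = db.execute(
    "SELECT body, author_id FROM comments WHERE post_id = ? ORDER BY id",
    (1,)).fetchall()

for _body, author_id in comments:
    loader.load(author_id)
loader.dispatch()

result = [(body, loader.get(author_id)) for body, author_id in comments]
print(result, "queries:", query_count["n"])
```

Note that the two statements run separately, with nothing tying them to the same snapshot of the database.
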
Because you are now relying on consistency to be orchestrated in both the database and the dataloader instance(s).

1. If the dataloader isn't the sole service with database connections, the cache will be invalid when other services interact with the database.

2. Even if the dataloader is the sole mechanism for accessing the database, you have to figure out how to scale that to multiple nodes and maintain cache coherency on each.

3. Even if you run just a single dataloader instance or figure out how to ensure cache coherency, that layer is still oblivious to triggers on the database and so you had better not use any advanced functionality there.

4. Even if you strip away all of the low level SQL features and treat the database as a dumb searchable KVS with a single dataloader instance (or cache coherent layer in front of it), you are still performing two queries. Mutations which occur in parallel with those queries can result in non-repeatable reads or phantom reads, because the default in many GraphQL packages is that each query and its sub-queries run with no transaction wrapped around them.

5. Even if you ensure that every GraphQL query gets its own transaction, that doesn't mean the DataLoader cache is _coherent_ with the database transactions, and I haven't seen any papers or effort to verify that, so there's no guarantee parallelization can't result in dirty reads.

6. Okay, so you have a single-threaded, single-instance dataloader with a mutex around a database connection that runs every GraphQL query's subqueries in a transaction...
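
Two of the points above (1 and 4) can be sketched in a few lines of Python with sqlite3. This assumes a cache that outlives a single request, and it simulates the "other writer" sequentially on the same connection rather than with real concurrency. All names and the schema are hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, post_id INTEGER, body TEXT);
    INSERT INTO users VALUES (1, 'ann');
    INSERT INTO comments VALUES (1, 1, 'first');
""")

cache = {}


def load_user_name(user_id):
    """Cache-first lookup, standing in for a long-lived loader cache."""
    if user_id not in cache:
        row = db.execute(
            "SELECT name FROM users WHERE id = ?", (user_id,)).fetchone()
        cache[user_id] = row[0]
    return cache[user_id]


# Point 1: something else writes to the database behind the cache.
load_user_name(1)  # warms the cache with 'ann'
db.execute("UPDATE users SET name = 'anne' WHERE id = 1")
db.commit()
cached = load_user_name(1)
actual = db.execute("SELECT name FROM users WHERE id = 1").fetchone()[0]
print("cache says:", cached, "| database says:", actual)

# Point 4: two separate queries with a write landing in between them.
count_seen = db.execute(
    "SELECT COUNT(*) FROM comments WHERE post_id = 1").fetchone()[0]
db.execute("INSERT INTO comments VALUES (2, 1, 'second')")  # the "mutation"
db.commit()
listed = db.execute(
    "SELECT body FROM comments WHERE post_id = 1 ORDER BY id").fetchall()
print("count from query 1:", count_seen, "| rows from query 2:", len(listed))
```

In both cases each individual query is correct on its own; the inconsistency only shows up when the results are combined into one response.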

This is all fine if you're dealing with, well, comments and posts or other trivia in which consistency isn't an issue. Which actually happens to be the type of problem many large successful companies have to deal with.

But if you are dealing with financial data, medical data, scheduling of resources, or anything else where the equivalent of "my friend posted but I don't see it yet, therefore I can't comment on her post" or "my post loaded but I don't see my friend's comment on it yet" is an issue, it's a minefield for consistency.

I still don't understand what you expect DataLoader or GraphQL to solve for you.

The problems you listed are not what DataLoader/GraphQL are trying to solve. I'm not even sure there is an individual library that can solve them. The solutions are at the architectural level and require more discussion than the decision to use GraphQL/DataLoader or not.

If you look at the graphql doc from the official website, this is what you get:

"GraphQL is a query language for your API, and a server-side runtime for executing queries by using a type system you define for your data. GraphQL isn't tied to any specific database or storage engine and is instead backed by your existing code and data."

And as far as I'm concerned, this is exactly what I think graphql is and should be. It doesn't say anything about caching or solving the N+1 problem or ACID/consistency.

You say "GraphQL does this, yes, but it's not particularly _smart_ about how caching works or how to avoid the Select N+1 problem." But the whole point of graphql is to be just a typed query layer, so you can use whatever strategy makes the most sense for your application. It's like saying "I'm surprised that JSON took off even though it's not smart enough to do X", when actually JSON took off _because_ it doesn't try to do all of those things.

Hasura's GraphQL engine is also a Haskell-based engine, and it solves the N+1 query problem by compiling the incoming request into a single SQL query: https://github.com/hasura/graphql-engine/tree/master/server

EDIT: /s/package/engine

Are you affiliated with Hasura? I'd love to chat. mayreply@[my username] dot com.
> This is hand-waved away because, I guess, consistency is boring and user expectations are low or irrelevant.

It's probably handwaved away because Facebook finds eventual consistency plus pubsub for updates to be good enough most of the time, and wants to shift the memory and CPU costs of calculating JOINs into the easily scalable GraphQL layer instead of the data store.