Hacker News new | ask | show | jobs
by jerf 749 days ago
My blocker on ever using GraphQL is generally if you've got enough data to need GraphQL you're hitting a database of some kind... and I do not generally hand direct query access to any client, not even other projects within the same organization, because I've spent far too much time in my life debugging slow queries. If even the author of a system can be surprised by missing indices and other things that cause slow queries, both due to initial design and due to changes to how the database decides to do things as things scale up, how can I offload the responsibility of knowing what queries will and will not complete in a reasonable period of time to the client? They get the power to run anything they want and I get the responsibility of having to make sure it all performs and nobody on either side has a contract for what is what?

I've never gotten a good answer to that question, so I've never even considered GraphQL in such systems where it may have made sense.

I can see it in something big like Jira or GitHub to talk to itself, so the backend & frontend teams can use it to decouple a bit, and then if something goes wrong with the performance they can pick up the pieces together as still effectively one team. But if that crosses a team boundary the communication costs go much higher and I'd rather just go through the usual "let's add this to the API" discussions with a discrete ask rather than "the query we decided to run today is slow, but we may run anything else any time we feel like it and that has to be fast too".

1 comments

It seems there is a recent trend of using adapters that expose data stores over graphql automatically, which is kind of scary.

The graphql usage I'm used to works more or less the same as REST. You control the schema and the implementation, you control exactly how much data store access is allowed, etc. It's just like REST except the schema syntax is different.

The main advantage of GraphQL IMO is the nice introspection tools that frontend devs can use, i.e. GraphiQL and run queries from that UI. It's like going shopping in a nice supermarket.

Not particularly scary. For something like Hasura, resolvers are opt-in, not opt-out. So that should alleviate some of your concerns off the bat.

For Postgraphile, it leans more heavily on the database, which I prefer. Set up some row-level access policies along with table-level grant/revoke, and security tends to bubble up. There's no getting past a UI or middleware bug to get the data when the database itself is denying access to records. Pretty simple to unit test, and much more resistant to data leakage when the one-off automation script doesn't know the rules.

I also love that the table and column comments bubble up automatically as GraphiQL resolver documentation.

Agreed about the introspection tools. I can send a GraphiQL URL to most junior devs with little to no SQL experience, and they'll get data to their UI with less drama than with Swagger interfaces IMO. (Though Swagger tends to be pretty easy too compared to the bad old days.)

But as people noted, it's not the "can this get the data" unit testing that's a problem here. It's the performance issues.

> I can send a GraphiQL URL to most junior devs with little to no SQL experience, and they'll get data to their UI with less drama

But that's like giving direct (read) database access to someone that was taught the syntax of SQL but not the performance implications of the different types of queries. Sure, they can get the data they want; but the production server may fall over when someone hits the front end in a new way. Which is, I think, what a lot of people are talking about when they talk about GraphQL having load issues based on the front end changing their call.

Hasura and Postgraphile are quite performant. No complaints there.

And you can both put queries on an allow list, control max query depth, and/or throttle on query cost.

> put queries on an allow list, control max query depth, and/or throttle on query cost.

All of which are features which give you some way to respond to the performance issues you avoid by planning your API up-front.

None of those are automatic. Any API you write still has to remember to put in LIMIT clauses, to avoid OFFSET. And then you have to actually write the API rather than have it generated for you.

There are no free lunches, especially with regard to security and sanity checks.

> It seems there is a recent trend of using adapters that expose data stores over graphql automatically, which is kind of scary.

I think that's the part where I have a disconnect. To me, both REST and GraphQL likely need to hit the database to get their data, and I would be writing the code that does that. Having the front end's call directly translated to database queries seems insane. The same would be true if you wrote a REST API that hit the database directly and took table/field names from query parameters; only... we don't do that because it would be stupid.