Hacker News new | ask | show | jobs
by DanielHB 749 days ago
GraphQL is very good for places where frontend and backend developers are isolated from each other (separate teams). Or rather places where you have data-producers and data-consumers as separate teams. If you have a big enough org eventually there will be many of such teams, interdisciplinary teams are not feasible at scale for everything.

It allows teams to work with less communication overhead. It solves more of a human problems than a technical problems, if someone think there is no value in that and bare metal performance is paramount that person never worked in a big org.

> RPC and REST are just more straightforward to monitor, log, cache, authorize and debug.

In some ways yes, in others no. For example it can be near impossible to see if a deprecated field in a REST API is still being used and by which clients it is being used. With GraphQL this is fairly simple.

Unfortunately GraphQL way of working is very different from normal REST APIs and often requires more complex server-side caching. The N+1 problem needs to be figured out upfront for every data-storage system used in the backend.

4 comments

The problem is you delegate a lot of the query building to the client, hoping that it will not suddenly change your performance profile by being creative and that you will have not missed an obviously expensive use case coming.

That's a huge bet, especially given that GraphQL is expensive in the first place, and given that the more you grow the API in size, the less you can actually map the cartesian product of all request params.

I'm not sure this is any more or less of a problem for REST APIs. What if your engineers change $client/$server and the new version makes really expensive queries? Well, ask them not to do that, then when some of them inevitably ignore you, start to review their code, terminate long-running queries, batch or pool fanouts so they don't take anything down, monitor new releases and roll back if anything breaks, etc.

If you're providing an external API like GitHub does, then that's a different story and I agree.

If you have separation between front and back end, then the back end team can elect to serve REST APIs which only permit selecting, filtering, grouping and pagination that they know they can support within defined latency bounds for a given traffic level.

Thing get more problematic when there's vertical ownership for a feature, where the UI needs just a few extra things and you end up with a REST response which is fatter and fatter, in the interest of avoiding round trips and client-side joins.

The problem with killing correct queries that take too long is that it shows up as intermittent failure that's dependent on server load and data cardinality. You might not find it in testing and it ships a bad experience to the customer before bugs are found. Whereas APIs which can't be so easily misused make it much harder to ship bugs.

> the back end team can select to serve REST APIs which only permit selecting, filtering, grouping and pagination that they know they can support within defined latency bounds for a given traffic level.

Why do you think that they can't do that with GraphQL? GraphQL isn't open ended. Its a highly restricted syntax for calling nested resources. If a resource is expensive simply don't nest it and make it a top level field and it is the same as REST?

Lots of nested resources are by default efficiently served by GraphQL because they are mostly returning single object foreign keys. Something that would take extra calls with REST.

GraphQL can have the same restrictions and performance guarantees as REST but the later is not necessarily true because in REST there is no standard way to define nested resource access.

I think the point here is, if you have to involve a backend team to add restrictions to the graphql endpoints and try to make good educated guesses where those might be, then the idea of frontend not needing backend engineers to query whatever they need becomes less of an advantage. So is the complexity of setting up graphql and then having your backend team try and make sure no frontend engineers can do terrible queries better for the software than custom rest APIs where needed and standard resource APIs everywhere else. Obviously it depends on the project. But I find the complexity of setting up and managing graphql often isn’t worth the pain, especially with schema first resource api designs and tooling like Google’s AIP linter.
> if you have to involve a backend team to add restrictions to the graphql endpoints and try to make good educated guesses where those might be, then the idea of frontend not needing backend engineers

No because if you dont do that you have to involve more engineers anyways to build the REST endpoints and keep modifying the rest endpoints.

GraphQL is also default restrictive (i.e. exposes nothing). You don't need to add engineers to make it restrictive.

In Startups typically:

  -> Frontend changes most frequently
  -> Backend "utility functions " changes less
  -> Data model changes the least
Without Graphql your "backend" ends up needing to have a lot of work and changes because it is constantly needing to be updated to the needs of the most frequent changes happening on the frontend.

With GraphQL the only time you need to update the backend is when those "utility" functions change (i.e. 3rd party api calls, etc) or the data model changes.

So you end up needing substantially less backend engineers.

There's a lot of relevant differences between REST & GraphQL. It is possible to construct a REST endpoint that simply can't do any of those things, and such construction is a mid-level developer task at best. For instance, pagination of "all posts ever" is not uncommon, and clients won't be shocked to deal with it. GraphQL is enough harder to characterize the performance of that it definitely qualifies as a change in quantity that is itself a change in quality. Hypothetically, both approaches are vulnerable to all the same issues, but GraphQL is far more vulnerable.
This is Wrong.

GraphQL only exposes what you ask it to. There are plenty of pagination plugins for GraphQL frameworks just as there are plugins to REST framework.

GraphQL can be restrictive as REST if you want it to be.

The point is GraphQL can be "as restrictive" as REST, but if you want to enable more efficient queries by knowing all the data that the frontend is requesting, you can. But the opposite isn't true of REST. With REST if you want more advanced functionality like that you have to define your own specification.

But then what's the point of using it if it's to get the limitation of REST?

You get something more complex, more expensive to maintain, consuming more resources, and configure it to basically be REST with extra steps.

> more complex, more expensive to maintain, consuming more resources,

Idk. Strawberry GQL and most GQL libraries are maybe equally as complex as the REST libraries for the same language. Strawberry and FastAPI I would say are equal in complexity and configuration.

It would be hard for me to say GQL is more expensive or consumes more resources. Opposite of the purpose and most uses of GQL.

Sorry, what? The original suggestion was that a developer would change things and it would cause performance problems. That same developer can change either a REST system or a GraphQL system and introduce the same performance issues in the same way, probably by adding a horrible N+1 query, or unbounded parallelism, or unbounded anything else.

Yeah, the client can't change the query if you don't let it specify a query, this is trivially true, but the developer can go break an API endpoint with the exact same result while trying to achieve the exact same business outcome.

The much more constrained input of the REST query means that the effect of changes on the API are much more comprehensible. Performance testing a particular REST endpoint is generally practical, and if a dev doesn't do it, the responsibility is reasonably placed on them. GraphQL means that you may do something you think is completely innocent like changing some index but for some query you didn't anticipate it trashes the performance. The range of things the dev of a GraphQL endpoint must be keeping track of is much larger than a REST endpoint, arguably exponentially so (though generally with a low power factor in practice, the possible queries are still exponentially complicated), and taking on any form of exponential responsibility is generally something that you should do only as a last resort, even if you do think your powers will stay low.
Obviously depends on the API but a REST API that maps relatively cleanly to database queries is going to make it very clear on both the client and the server when it’s not scaling well.

If, at page load, I’m making 100 HTTP requests to fetch 100 assets then as a client side developer I’m going to know that’s bad practise and that we really ought to have some kind of multi-get endpoint. With GraphQL that gets muddy, from the client side I’m not really sure if what I’m writing is going to be a massive performance drag or not.

> What if your engineers change $client/$server and the new version makes really expensive queries?

Yes, so the cost benefit here is not in favor of GraphQL. If both technologies ultimately suffer from the same issues (what to do about unpredictable clients), but one is far more complex to implement and use (GraphQL), then there's a clear winner. Spoiler, its not GraphQL.

Page specific endpoints, I would argue, can do 99% of what GraphQL was trying to do. If you want to use it as some sort of template language for constructing page specific endpoints, that could be useful (the same way xml schema is useful for specifying complex xml documents).

But you can optimize a page specific endpoint, and do it with REST-style endpoint to boot.

Having a bunch of "simple" calls and optimizing for the "complex" ones that you need using metrics/analysis is what you should be doing, not creating a complex API that is far harder to break down into "simple" cases.

When you build a GraphQL server, you’re creating a system that outputs page-specific endpoints. They can be generated just-in-time (the default) or at build time (the general recommendation).

The engineering work involved shifts from building individual endpoints to building the endpoint factory. This shift may or may not be worth the trade off, but there are definite advantages, especially from the perspective of whomever is building the client. And once you factor in the ease at which you can introduce partial streaming with defer and streamable (granted they’re still WIP spec-wise), the experience can be pretty sublime.

https://graphql.org/blog/2020-12-08-defer-stream/

This? Yeah, that seems neat, for command/batch queuing.

I'd be curious how it compares to e.g. rest apis returning refs to e.g. webrtc streams or tcp/udp ones for non-browser. I presume the main advantage would be client side.

Even a SQL query can suffer the same fate. Ever tried writing a SQL query against a distributed database that isn’t optimized for that read path?

I think that’s what’s really pointing out the root cause issues here, it’s not purely GraphQL’s problem, it’s the problems inherent to distributed systems.

I haven't done much more than toy projects in GraphQL. Is there no way to limit the query complexity/cost? Such as a statement timeout in postgres?
Ah but that's the beauty of GraphQL, a query can actually fetch data from several systems: the db, the cache, the search engine, etc. It's all abstracted away.

But let's say you have a timeout, and they have a retry, then suddenly, your server is now spammed by all the clients retry again and again a query that worked a week ago, but today is too heavy because of a tiny change nobody noticed.

And even if it's not the case, you can now break the client at any time because they decide to use a feature you gave them, but that you actually can't deal with right now.

To be clear, the main thing that's abstracted away are server round-trips and client-side joins. REST APIs can fetch data from different systems too.
Sure but queries are crafted in the client, that may know nothing about this, while a rest api, the requests are restricted and the queries are more likely under control of the server, which mean the backend will likely decide what fetches what and when.

It takes a lot of work to actually ensure all possible combinations of graphql params hit exactly what you want in the backend, and it's easy to mess with it in the frontend.

I'm not that much into GraphQL but I vaguely remember libraries that provide some kind of atteibutes you apply to entities/loaders and then pre-execution an estimated cost is calculated (and aborted if over a specified threshold).
API Platform for PHP is one of those graphql implementations that has a query cost limiter built in (it's binary, it just rejects queries that go over your configured complexity threshold). Shopify's graphql api is even fancier, where every query costs X amount of a limited number of "credits". The structure of gql itself makes costs easier to estimate (you have as many joins as you have bracket pairs, more or less), and some servers can recognize a directive in the schema to declare the "real" cost.
That's sort of my expectation too -- it would be nuts to provide a user facing system without bounds of some sort.
There’s a free query depth. There’s ways to do query cost but federating that then becomes really annoying. Coming from the security side, there’s always a long curve of explaining and an even longer curve of mitigating.

I always am happy when I get an elegant query working. Often however I just find I wasted time looking for a clean 1 query solution when iteration by caller was the only solution.

When the client’s data requirements change, isn’t there always a risk that the data loading performance profile will change?

Surely that is always the case, if the client is composing multiple REST requests, or if there’s one RPC method per client page, or with any other conceivable data loading scheme.

Couldn't disagree more. GraphQL encourages tight-coupling--the Frontend is allowed to send any possible query to the Backend, and the Backend needs to accommodate all possible permutations indefinitely and with good performance. This leads to far more fingerpointing/inefficiency in the log run, despite whatever illusion of short-term expediency it creates.

It is far better for the Backend to provide Frontend a contract--can do it with OpenAPI/Swagger--here are the endpoints, here are the allowed parameters, here is the response you will get--and we will make sure this narrowly defined scope works 100% of the time!

> It is far better for the Backend to provide Frontend a contract

It sure is better for the backend team, but the client teams will need to have countless meetings begging to establish/change a contract and always being told it will come in the next sprint (or the one after, or in Q3).

> This leads to far more fingerpointing/inefficiency in the log run, despite whatever illusion of short-term expediency it creates.

It is true it can cause these kind of problems, but they take far, far, far less time than mundane contract agreement conversations. Although having catastrophic failures is usually pretty dire when they do happen, but there are a lot of ways of mitigating as well like good monitoring and staggered deployments.

It is a tradeoff to be sure, there is no silver bullet.

trying to solve org problems with tech just creates more problems, allthewhile not actually solving the original problem.
This is what I wanted to say too. If your backend team is incapable of rapidly adding new endpoints for you, they probably are going to create a crappy graphql experience and not solve those problems either. So many frontend engineers on here saying that graphql solves the problem they had with backend engineers not being responsive or slow, but that is an org problem, not a technology problem.
At TableCheck, our frontend engineers started raising backend PRs for simple API stuff. If you use a framework like Rails, once you have the initial API structure sketched out, 80% of enhancements can be done by backend novices.
Yup. And the solution to that org problem is for the front engineers to slow down, and help out the "backend" engineers. The complexity issues faced by the back-end are only getting worse with time, the proper solution is not adding more complexity to the situation, but paying down the technical debt in your organization.

If your front-end engineers end up twiddling their thumbs (no bugs/hotfixes), perhaps there is time (and manpower) to try to design and build a "new" system that can cater to the new(er) needs.

GraphQL is the quintessential move fast and break things technology, I have worked in orgs and know other people who have done so in other orgs where getting time from other teams is really painful. It is usually caused by extreme pressure to deliver things.

What ends up happening is the clients doing work arounds to backend problems which creates even more technical debt

I never understood why this was such a big deal... "Hey, we need an endpoint to fetch a list of widgets so we can display them." "Okay." Is that so difficult? Maybe the real problem lies in poor planning and poor communication.
Not to mention the fact that GraphQL allows anyone messing with your API to also execute any query they want. Then you start getting into query whitelisting which adds a lot of complexity.
Most REST APIs I've seen in the wild just send the entire object back by default because they have no idea which fields the client needs. Most decent REST API implementations do support a field selection syntax in the query string, but it's rare that they'll generate properly typed clients for it. And of course OpenAPI has no concept of field selection, so it won't help you there either.

With my single WordPress project I found that WP GraphQL ran circles around the built-in WP REST API because it didn't try to pull in the dozens of extra custom fields that I didn't need. Not like it's hard to outdo anything built-in to WP tho...

> the Frontend is allowed to send any possible query to the Backend

It's really not, it's not exposing your whole DB or allowing random SQL queries.

> It is far better for the Backend to provide the Frontend a contract

GraphQL does this - it's just called the GraphQL "schema". It's not your entire database schema.

A GraphQL schema is a contract though.

And the REST API can still get hammered by the client - they could do an N + 1 query on their side. With GraphQL at least you can optimize this without adding a new endpoint.

Yes, GraphQL is a "contract" in the sense that a blank check is also a "contract".
You can whitelist queries in most systems though. In development mode allow them to run whatever query, and then lock it in to the whitelist for production. If that type of control is necessary.
Can you explain what you mean by this? The GraphQL API you expose allows only a certain schema. Sure, callers can craft a request that is slow because it's asking for too much, but

- Each individual thing available in the request should be no less timely to handle than it would via any other api

- Combining too many things together in a single call isn't a failing of the GraphQL endpoint, it's a failing of the caller; the same way it would be if they made multiple REST calls

Do you have an example of a call to a GraphQL API that would be a problem, that wouldn't be using some other approach?

Then we just come back full round trip to REST where the backend clearly defines what is allowed and what is returned. So using GraphQL it is unnecessary complicated to safeguard against a caller querying for all of the data and then some. For example the caller queries nested structures ad infinitum possibly even triggering a recursive loop that wakes up somebody at 3am.
But GraphQL doesn't allow for infinitely nested queries; the query itself has to include as much depth as it wants in the response.

> Then we just come back full round trip to REST

Except that GraphQL allows the back end to define the full set of fields that are available, and the front end can ask for some subset of that. This allows for less load; both on the network and on what the back end needs to fetch data for.

From a technical perspective, GraphQL is (effectively) just a REST API that allows the front end to specify which data it wants back.

You are correct and I agree with you. GraphQL can be used effectively like this and I've seen one example where GraphQL is used like this. New endpoint can be defined very quickly and it is essentially like a REST API with the possibility of the client specifying what data it wants back (as you described).

The other extreme end example is to expose by default the entire data model (PostGraphile) and then getting lost in the customisation and authorisation.

The problem is that the client team can - without notice - change their query patterns in a way that creates excess load when deployed.

When you use the "REST" / JSON-over-HTTP pattern which was more common in 2010, changes in query patterns necessarily involve the backend team, which means they are aware of the change & have an opportunity to get ahead of any performance impact.

My blocker on ever using GraphQL is generally if you've got enough data to need GraphQL you're hitting a database of some kind... and I do not generally hand direct query access to any client, not even other projects within the same organization, because I've spent far too much time in my life debugging slow queries. If even the author of a system can be surprised by missing indices and other things that cause slow queries, both due to initial design and due to changes to how the database decides to do things as things scale up, how can I offload the responsibility of knowing what queries will and will not complete in a reasonable period of time to the client? They get the power to run anything they want and I get the responsibility of having to make sure it all performs and nobody on either side has a contract for what is what?

I've never gotten a good answer to that question, so I've never even considered GraphQL in such systems where it may have made sense.

I can see it in something big like Jira or GitHub to talk to itself, so the backend & frontend teams can use it to decouple a bit, and then if something goes wrong with the performance they can pick up the pieces together as still effectively one team. But if that crosses a team boundary the communication costs go much higher and I'd rather just go through the usual "let's add this to the API" discussions with a discrete ask rather than "the query we decided to run today is slow, but we may run anything else any time we feel like it and that has to be fast too".

It seems there is a recent trend of using adapters that expose data stores over graphql automatically, which is kind of scary.

The graphql usage I'm used to works more or less the same as REST. You control the schema and the implementation, you control exactly how much data store access is allowed, etc. It's just like REST except the schema syntax is different.

The main advantage of GraphQL IMO is the nice introspection tools that frontend devs can use, i.e. GraphiQL and run queries from that UI. It's like going shopping in a nice supermarket.

Not particularly scary. For something like Hasura, resolvers are opt-in, not opt-out. So that should alleviate some of your concerns off the bat.

For Postgraphile, it leans more heavily on the database, which I prefer. Set up some row-level access policies along with table-level grant/revoke, and security tends to bubble up. There's no getting past a UI or middleware bug to get the data when the database itself is denying access to records. Pretty simple to unit test, and much more resistant to data leakage when the one-off automation script doesn't know the rules.

I also love that the table and column comments bubble up automatically as GraphiQL resolver documentation.

Agreed about the introspection tools. I can send a GraphiQL URL to most junior devs with little to no SQL experience, and they'll get data to their UI with less drama than with Swagger interfaces IMO. (Though Swagger tends to be pretty easy too compared to the bad old days.)

> It seems there is a recent trend of using adapters that expose data stores over graphql automatically, which is kind of scary.

I think that's the part where I have a disconnect. To me, both REST and GraphQL likely need to hit the database to get their data, and I would be writing the code that does that. Having the front end's call directly translated to database queries seems insane. The same would be true if you wrote a REST API that hit the database directly and took table/field names from query parameters; only... we don't do that because it would be stupid.

> changes in query patterns necessarily involve the backend team,

How does this follow? A client team can decide to e.g. put up a cross-sell shelf on a low-traffic page by calling a REST endpoint with tons of details and you have the same problem. I don't see the difference in any of these discussions, the only thing different is the schema syntax (graphql vs. openapi)

GQL is far more flexible wrt what kind of queries it can do (and yes, it can be constrained, but this flexibility is the whole point). Which means that turning a cheap query into an expensive one accidentally is very easy.

A hand-coded REST endpoint will give you a bunch of predefined options for filtering, sorting etc, and the dev who implements it will generally assume that all of those can be used, and write the backing query (and create indices) accordingly.

> and yes, it can be constrained, but this flexibility is the whole point

Flexibility within the constraints of what the back end can safely support.

To me, the "whole point" of GraphQL is to be able to have the client ask for only the data they need, rather than have a REST API that returns _everything_ and letting the front end throw away what they don't need (which incurs more load).

If you can't support returning certain configurations of data, then... don't.

And the client team using a REST API can do the exact same thing, by making more calls. There's no real difference between "more calls" and "same amount of calls, but requests more data".

That being said, it's a lot easier to setup caching for REST calls.

> GraphQL is very good for places where frontend and backend developers are isolated from each other (separate teams)

What do you mean by "backend developer" ? The one who creates the GraphQL endpoints for UI to consume ?

> In some ways yes, in others no. For example it can be near impossible to see if a deprecated field in a REST API is still being used and by which clients it is being used. With GraphQL this is fairly simple.

You should log deprecation warnings.

But also if the client is composing urls/params manually then you are not doing REST, you are doing RPC.

Rest APIs should mainly use HATEOAS hyperlinks to obtain a resource. that is clients almost always call links you have provided in other reponses (starting from a main entrypoint)

REST is just a short name for RPC over JSON. Nobody does real Fielding's REST.
I disagree, REST is still meaningfull even in the usual loosened sense, it still means that you are working with some kind of actions on resources