by BiteCode_dev 747 days ago
The problem is that you delegate a lot of the query building to the client, hoping that it will not suddenly change your performance profile by being creative, and that you have not missed an obviously expensive use case.

That's a huge bet, especially given that GraphQL is expensive in the first place, and given that the more you grow the API in size, the less you can actually map the cartesian product of all request params.


I'm not sure this is any more or less of a problem for REST APIs. What if your engineers change $client/$server and the new version makes really expensive queries? Well, ask them not to do that, then when some of them inevitably ignore you, start to review their code, terminate long-running queries, batch or pool fanouts so they don't take anything down, monitor new releases and roll back if anything breaks, etc.

If you're providing an external API like GitHub does, then that's a different story and I agree.
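For what it's worth, "terminate long-running queries" and "pool fanouts" can be sketched in a few lines. This is a minimal illustration, not anyone's production setup: `bounded_fanout`, `max_concurrency` and `timeout_s` are hypothetical names, and the callables passed in stand for whatever backend calls a request fans out to.

```python
import asyncio


async def bounded_fanout(factories, max_concurrency=4, timeout_s=1.0):
    """Run coroutine factories with a concurrency cap and a per-call timeout.

    Calls that exceed the timeout are cancelled and reported as None.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(factory):
        # At most max_concurrency calls are in flight at any time.
        async with semaphore:
            try:
                return await asyncio.wait_for(factory(), timeout=timeout_s)
            except asyncio.TimeoutError:
                return None

    # gather preserves the order of the inputs in its result list.
    return await asyncio.gather(*(run_one(f) for f in factories))
```

The same wrapper works whether the fanout was triggered by a REST handler or a GraphQL resolver, which is roughly the point being made above.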

If you have separation between front and back end, then the back end team can elect to serve REST APIs which only permit selecting, filtering, grouping and pagination that they know they can support within defined latency bounds for a given traffic level.

Things get more problematic when there's vertical ownership for a feature, where the UI needs just a few extra things and you end up with a REST response which is fatter and fatter, in the interest of avoiding round trips and client-side joins.

The problem with killing correct queries that take too long is that it shows up as intermittent failure that's dependent on server load and data cardinality. You might not find it in testing and it ships a bad experience to the customer before bugs are found. Whereas APIs which can't be so easily misused make it much harder to ship bugs.
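As a rough illustration of an API that "can't be so easily misused", here is a sketch of a list endpoint's parameter validation that only permits the sorting, filtering and pagination the back end has chosen to support. The field names and limits (`ALLOWED_SORT_FIELDS`, `ALLOWED_FILTERS`, `MAX_PAGE_SIZE`) are made up for the example.

```python
from dataclasses import dataclass

# Hypothetical limits chosen by the back end team.
ALLOWED_SORT_FIELDS = {"created_at", "title"}
ALLOWED_FILTERS = {"author_id", "status"}
MAX_PAGE_SIZE = 100
RESERVED_PARAMS = {"sort", "limit", "offset"}


@dataclass(frozen=True)
class ListQuery:
    sort: str
    filters: dict
    limit: int
    offset: int


def parse_list_query(params: dict) -> ListQuery:
    """Validate raw query-string params against the supported shapes."""
    sort = params.get("sort", "created_at")
    if sort not in ALLOWED_SORT_FIELDS:
        raise ValueError(f"unsupported sort field: {sort}")

    # Everything that is not a reserved param is treated as a filter.
    filters = {k: v for k, v in params.items() if k not in RESERVED_PARAMS}
    unknown = set(filters) - ALLOWED_FILTERS
    if unknown:
        raise ValueError(f"unsupported filters: {sorted(unknown)}")

    # Clamp pagination so no caller can ask for an unbounded page.
    limit = max(1, min(int(params.get("limit", 20)), MAX_PAGE_SIZE))
    offset = max(int(params.get("offset", 0)), 0)
    return ListQuery(sort=sort, filters=filters, limit=limit, offset=offset)
```

Anything outside the allowlist fails at the boundary with a deterministic error, rather than as a load-dependent timeout later.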

> the back end team can elect to serve REST APIs which only permit selecting, filtering, grouping and pagination that they know they can support within defined latency bounds for a given traffic level.

Why do you think that they can't do that with GraphQL? GraphQL isn't open-ended. It's a highly restricted syntax for calling nested resources. If a resource is expensive, simply don't nest it: make it a top-level field, and it is the same as REST.

Lots of nested resources are by default efficiently served by GraphQL because they are mostly returning single object foreign keys. Something that would take extra calls with REST.

GraphQL can have the same restrictions and performance guarantees as REST, but the reverse is not necessarily true, because in REST there is no standard way to define nested resource access.

I think the point here is that if you have to involve a backend team to add restrictions to the GraphQL endpoints and make educated guesses about where those should be, then the idea of the frontend not needing backend engineers to query whatever it needs becomes less of an advantage. So is the complexity of setting up GraphQL, and then having your backend team make sure no frontend engineer can write terrible queries, better for the software than custom REST APIs where needed and standard resource APIs everywhere else? Obviously it depends on the project. But I find the complexity of setting up and managing GraphQL often isn’t worth the pain, especially with schema-first resource API designs and tooling like Google’s AIP linter.

> if you have to involve a backend team to add restrictions to the graphql endpoints and try to make good educated guesses where those might be, then the idea of frontend not needing backend engineers

No, because if you don't do that you have to involve more engineers anyway, to build the REST endpoints and keep modifying them.

GraphQL is also default restrictive (i.e. exposes nothing). You don't need to add engineers to make it restrictive.

In Startups typically:

  -> Frontend changes most frequently
  -> Backend "utility functions" change less
  -> Data model changes the least

Without GraphQL, your "backend" ends up needing a lot of work and changes, because it constantly has to be updated to meet the needs of the most frequent changes, which happen on the frontend.

With GraphQL the only time you need to update the backend is when those "utility" functions change (i.e. 3rd party api calls, etc) or the data model changes.

So you end up needing substantially fewer backend engineers.

But you actually don't need to keep modifying the REST endpoints for most projects, that's what everybody is saying.

The vast majority of projects don't gain anything from this flexibility, because you don't suddenly have 1000 FarmVille copycats that each need their own little queries. You just have Bob, who needs an order by.

> With GraphQL the only time you need to update the backend is when those "utility" functions change (i.e. 3rd party api calls, etc) or the data model changes.

This is akin to saying that "directly exposing the database is easier, you only have to change things if the data changes".

And yes this is true, but when the data changes, or the environment changes, the paradigm falls apart a bit, no? Which is what the backend code was for, insulation from that.

> In Startups typically:

Yes, so for a short-lived, non-scaled application it's far easier to do it one way, and maybe that's fine for most small apps (that will never scale far). I suspect a lot of the pushback comes from larger, less nimble, more backwards-compat-focused organizations/apps.

There are a lot of relevant differences between REST & GraphQL. It is possible to construct a REST endpoint that simply can't do any of those things, and such construction is a mid-level developer task at best. For instance, pagination of "all posts ever" is not uncommon, and clients won't be shocked to deal with it. GraphQL's performance is enough harder to characterize that it definitely qualifies as a change in quantity that is itself a change in quality. Hypothetically, both approaches are vulnerable to all the same issues, but GraphQL is far more vulnerable.

This is wrong.

GraphQL only exposes what you ask it to. There are plenty of pagination plugins for GraphQL frameworks, just as there are plugins for REST frameworks.

GraphQL can be as restrictive as REST if you want it to be.

The point is GraphQL can be "as restrictive" as REST, but if you want to enable more efficient queries by knowing all the data that the frontend is requesting, you can. But the opposite isn't true of REST. With REST if you want more advanced functionality like that you have to define your own specification.

But then what's the point of using it if it's to get the limitation of REST?

You get something more complex, more expensive to maintain, consuming more resources, and configure it to basically be REST with extra steps.

> more complex, more expensive to maintain, consuming more resources,

Idk. Strawberry GQL and most GQL libraries are maybe equally as complex as the REST libraries for the same language. Strawberry and FastAPI I would say are equal in complexity and configuration.

It would be hard for me to say GQL is more expensive or consumes more resources. Opposite of the purpose and most uses of GQL.

In Strawberry you write a method per field you want to retrieve; I would say it is indeed more complex and costly.

Sorry, what? The original suggestion was that a developer would change things and it would cause performance problems. That same developer can change either a REST system or a GraphQL system and introduce the same performance issues in the same way, probably by adding a horrible N+1 query, or unbounded parallelism, or unbounded anything else.

Yeah, the client can't change the query if you don't let it specify a query, this is trivially true, but the developer can go break an API endpoint with the exact same result while trying to achieve the exact same business outcome.
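For readers who haven't run into the "horrible N+1 query" mentioned above, here is a toy version. `FakeDB` is a made-up in-memory stand-in that just counts round trips; the posts are assumed to have been fetched already (that is the "1"), and the author lookups are the "N". The same mistake can be made in a REST handler or a GraphQL resolver.

```python
class FakeDB:
    """In-memory stand-in for a database that counts round trips."""

    def __init__(self, authors):
        self.authors = authors
        self.queries = 0

    def get_author(self, author_id):
        self.queries += 1
        return self.authors[author_id]

    def get_authors(self, author_ids):
        self.queries += 1
        return {i: self.authors[i] for i in set(author_ids)}


def names_n_plus_one(db, posts):
    # One lookup per post: round trips grow with the number of posts.
    return [db.get_author(p["author_id"])["name"] for p in posts]


def names_batched(db, posts):
    # One lookup for all posts, however many there are.
    authors = db.get_authors([p["author_id"] for p in posts])
    return [authors[p["author_id"]]["name"] for p in posts]
```

Both functions return the same names; only the number of round trips differs, which is why this kind of regression is easy to miss in review.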

The much more constrained input of the REST query means that the effect of changes on the API are much more comprehensible. Performance testing a particular REST endpoint is generally practical, and if a dev doesn't do it, the responsibility is reasonably placed on them. GraphQL means that you may do something you think is completely innocent like changing some index but for some query you didn't anticipate it trashes the performance. The range of things the dev of a GraphQL endpoint must be keeping track of is much larger than a REST endpoint, arguably exponentially so (though generally with a low power factor in practice, the possible queries are still exponentially complicated), and taking on any form of exponential responsibility is generally something that you should do only as a last resort, even if you do think your powers will stay low.

Obviously depends on the API but a REST API that maps relatively cleanly to database queries is going to make it very clear on both the client and the server when it’s not scaling well.

If, at page load, I’m making 100 HTTP requests to fetch 100 assets then as a client side developer I’m going to know that’s bad practise and that we really ought to have some kind of multi-get endpoint. With GraphQL that gets muddy, from the client side I’m not really sure if what I’m writing is going to be a massive performance drag or not.

> What if your engineers change $client/$server and the new version makes really expensive queries?

Yes, so the cost-benefit here is not in favor of GraphQL. If both technologies ultimately suffer from the same issues (what to do about unpredictable clients), but one is far more complex to implement and use (GraphQL), then there's a clear winner. Spoiler: it's not GraphQL.

Page specific endpoints, I would argue, can do 99% of what GraphQL was trying to do. If you want to use it as some sort of template language for constructing page specific endpoints, that could be useful (the same way xml schema is useful for specifying complex xml documents).

But you can optimize a page specific endpoint, and do it with REST-style endpoint to boot.

Having a bunch of "simple" calls and optimizing for the "complex" ones that you need using metrics/analysis is what you should be doing, not creating a complex API that is far harder to break down into "simple" cases.

When you build a GraphQL server, you’re creating a system that outputs page-specific endpoints. They can be generated just-in-time (the default) or at build time (the general recommendation).
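One way to read "generated at build time" is something like persisted queries: the client's queries are collected into a manifest during the build, and the server only runs queries whose ID appears in it. That reading is an assumption on my part, and the names below (`query_id`, `build_manifest`, `resolve`) are hypothetical, not any library's API.

```python
import hashlib


def query_id(query: str) -> str:
    """Derive a stable ID from a query, ignoring whitespace differences.

    Naive normalization: it would also collapse whitespace inside string
    literals, which a real implementation would need to avoid.
    """
    normalized = " ".join(query.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def build_manifest(queries):
    """Build-time step: map each known query's ID to its text."""
    return {query_id(q): q for q in queries}


def resolve(manifest, requested_id):
    """Request-time step: only queries registered at build time may run."""
    try:
        return manifest[requested_id]
    except KeyError:
        raise LookupError(f"unknown query id: {requested_id}") from None
```

With this shape, the set of queries the server can be asked to run is finite and known ahead of time, so each one can be profiled like a page-specific endpoint.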

The engineering work involved shifts from building individual endpoints to building the endpoint factory. This shift may or may not be worth the trade-off, but there are definite advantages, especially from the perspective of whoever is building the client. And once you factor in the ease with which you can introduce partial streaming with @defer and @stream (granted, they’re still WIP spec-wise), the experience can be pretty sublime.

https://graphql.org/blog/2020-12-08-defer-stream/

This? Yeah, that seems neat, for command/batch queuing.

I'd be curious how it compares to e.g. rest apis returning refs to e.g. webrtc streams or tcp/udp ones for non-browser. I presume the main advantage would be client side.

Even a SQL query can suffer the same fate. Ever tried writing a SQL query against a distributed database that isn’t optimized for that read path?

I think that’s what’s really pointing out the root cause issues here, it’s not purely GraphQL’s problem, it’s the problems inherent to distributed systems.

I haven't done much more than toy projects in GraphQL. Is there no way to limit the query complexity/cost? Such as a statement timeout in postgres?

Ah but that's the beauty of GraphQL, a query can actually fetch data from several systems: the db, the cache, the search engine, etc. It's all abstracted away.

But let's say you have a timeout and they have a retry. Suddenly your server is spammed by all the clients retrying, again and again, a query that worked a week ago but is too heavy today because of a tiny change nobody noticed.

And even if it's not the case, you can now break the client at any time because they decide to use a feature you gave them, but that you actually can't deal with right now.

To be clear, the main things that are abstracted away are server round-trips and client-side joins. REST APIs can fetch data from different systems too.

Sure, but the queries are crafted in the client, which may know nothing about this. With a REST API, the requests are restricted and the queries are more likely under the control of the server, which means the backend will likely decide what fetches what and when.

It takes a lot of work to actually ensure all possible combinations of graphql params hit exactly what you want in the backend, and it's easy to mess with it in the frontend.

I'm not that much into GraphQL, but I vaguely remember libraries that provide some kind of attributes you apply to entities/loaders; an estimated cost is then calculated before execution, and the query is aborted if it is over a specified threshold.

API Platform for PHP is one of those GraphQL implementations that has a query cost limiter built in (it's binary: it just rejects queries that go over your configured complexity threshold). Shopify's GraphQL API is even fancier: every query costs X amount of a limited number of "credits". The structure of GQL itself makes costs easier to estimate (you have as many joins as you have bracket pairs, more or less), and some servers can recognize a directive in the schema to declare the "real" cost.

That's sort of my expectation too -- it would be nuts to provide a user facing system without bounds of some sort.

There’s a free query depth limit. There are ways to do query cost, but federating that then becomes really annoying. Coming from the security side, there’s always a long curve of explaining and an even longer curve of mitigating.
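To make the "bracket pairs" heuristic and the depth limit concrete, here is a deliberately naive sketch. It is not how API Platform or Shopify compute cost; `estimate`, `check` and the thresholds are hypothetical, and it counts braces in the raw query text instead of parsing GraphQL.

```python
def estimate(query: str) -> tuple[int, int]:
    """Return (brace_pairs, max_depth) for a GraphQL query string.

    Naive on purpose: braces inside string arguments would be miscounted.
    """
    pairs = depth = max_depth = 0
    for ch in query:
        if ch == "{":
            pairs += 1
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == "}":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced braces")
    if depth != 0:
        raise ValueError("unbalanced braces")
    return pairs, max_depth


def check(query: str, max_pairs: int = 10, max_depth: int = 5) -> tuple[int, int]:
    """Reject a query before execution if it exceeds either bound."""
    pairs, depth = estimate(query)
    if pairs > max_pairs or depth > max_depth:
        raise ValueError(f"query too expensive: pairs={pairs}, depth={depth}")
    return pairs, depth
```

A binary limiter of this kind rejects the query outright; a credit-based scheme would instead subtract the estimated cost from a per-client budget.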

I always am happy when I get an elegant query working. Often however I just find I wasted time looking for a clean 1 query solution when iteration by caller was the only solution.

When the client’s data requirements change, isn’t there always a risk that the data loading performance profile will change?

Surely that is always the case, if the client is composing multiple REST requests, or if there’s one RPC method per client page, or with any other conceivable data loading scheme.