by exogen 3159 days ago
> Combinatoric explosions of complexity via a single query path are not a feature of every API.

> Because RESTful APIs tend not to allow ad hoc graph traversal.

I think you're taking this graph part too literally. Almost every API has a "graph" of connected objects. GraphQL just makes it so that you can traverse them with a single query. REST endpoints tend to force you to make multiple queries to go back and fetch information about the entities whose IDs or URLs you received in earlier requests – thus the rate limiting. In both cases, combinatoric explosions (and infinite depth) are possible – REST just forces you to explode into more round-trips (and the server is likely doing even more duplicated work than it needs to in order to fulfill those subsequent requests).
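To make the round-trip point concrete, here is a minimal Python sketch. The `USERS` table and both function names are made up for illustration; it only counts requests and does not talk to a real REST or GraphQL server.

```python
# Hypothetical in-memory "API": each user has a list of friend IDs.
USERS = {
    1: {"name": "ann", "friends": [2, 3]},
    2: {"name": "bob", "friends": [3]},
    3: {"name": "cat", "friends": []},
}


def rest_friend_names(user_id):
    """REST style: one request for the user, then one more per friend ID."""
    requests = 1  # GET /users/<user_id>
    user = USERS[user_id]
    names = []
    for friend_id in user["friends"]:
        requests += 1  # a separate GET /users/<friend_id>
        names.append(USERS[friend_id]["name"])
    return names, requests


def single_query_friend_names(user_id):
    """Single-query style: the server walks the same edges within one request."""
    user = USERS[user_id]
    names = [USERS[friend_id]["name"] for friend_id in user["friends"]]
    return names, 1
```

Both functions touch the same objects; the only difference in this toy model is how many requests the client has to issue to get them.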

If you wanted to simulate the ease-off aspect of REST, where clients have to come back for multiple rate-limited round trips to get the data they want, you could simply add a delay in the nested objects' GraphQL resolvers so that they rate-limit themselves. Same result, but the clients don't need to know about it; they can just wait the same amount of time they'd have had to wait for all the data anyway.
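A rough sketch of that self-rate-limiting idea, in Python. `ThrottledResolver`, `min_interval`, and the injectable `clock`/`sleep` parameters are hypothetical names, not part of any GraphQL library; a real server would more likely do this asynchronously and per client.

```python
import time


class ThrottledResolver:
    """Wraps a resolver so successive calls are spaced at least
    `min_interval` seconds apart (hypothetical helper, not a library API)."""

    def __init__(self, resolve, min_interval, clock=time.monotonic, sleep=time.sleep):
        self._resolve = resolve
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next call may run

    def __call__(self, *args, **kwargs):
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            # Too soon: wait out the remainder of the interval.
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self._min_interval
        return self._resolve(*args, **kwargs)
```

Wrapping a nested field's resolver this way means a deep query simply takes longer to return, which is roughly what the client would have experienced across several rate-limited REST calls.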


> GraphQL just makes it so that you can traverse them with a single query.

Yes. This is what I'm saying. GQL allows for a combinatoric explosion of potentially required queries (and in extreme cases, data providers) to fulfill any request. And every GQL endpoint needs to be able to service all of them unless your request routing proxy can peek into body contents, which is more expensive than URL routing.

> REST endpoints tend to force you to make multiple queries to go back and fetch information about the entities whose IDs you received in earlier requests

A problem we can solve elegantly with HTTP/2 push using nearly identical underlying API servicing models. What's great about that approach is that it's totally transparent to the client; they just get better performance with less resources.

Instead, folks have decided to discard a lot of really positive aspects of the REST model to make a client-facing DSL realized in the server.

> In both cases, combinatoric explosions (and infinite depth) are possible

But in the classical REST case, the client is aware they're doing this, as well as the server. In the GraphQL case, we've obfuscated this and said, "We reserve the right to reject your request for any reason, and we've also made it harder for us to service your query (unless we go back to mandating every valid query as in REST), and we've also made scaling harder because it's more difficult to factor endpoints into different scaling groups."

But hey, that DSL is great. It's like JSON without all that predictability or syntactic validation.

I cannot see any positive outcomes to adopting graphql other than that, "Client-side developers love it". If y'all love it so much, why not maintain it on your side via service-worker query interception?

I ask facetiously. The answer is: because that would be really hard, and we'd rather push it off to API endpoint devs. Devs who promptly put restrictions that basically render the best part of GraphQL (that it is a query language) impotent for performance reasons.

> I cannot see any positive outcomes to adopting graphql

How about one HTTP request often being faster than multiple requests? How about only retrieving the payload you requested rather than all the extra data the API developers decided to expose in the endpoint – bandwidth isn't free? How about not having to add new custom bespoke API endpoints because some new part of the website just needs a few little different pieces of data that would normally require several round trips, pretty please? These are just normal everyday issues that people put up with when using REST APIs.

> A problem we can solve elegantly with HTTP/2 push using nearly identical underlying API servicing models. What's great about that approach is that it's totally transparent to the client; they just get better performance with less resources.

Shouldn't you use push when you know the client will ask for the resources? How would you know whether the client will ask for certain connected objects in this case? Would you just always be pushing every connected object over the wire?

I don't really see why you're giving such special distinction to one HTTP request vs. multiple HTTP requests. That is an arbitrary distinction to make. You shouldn't be asking "how much strain can a client put on my server with a single request? but oh, they can make multiple requests…" but "how much strain is a client going to put on my server to get all the data they need, whether it happens across one request or multiple?"

> How about one HTTP request often being faster than multiple requests?

HTTP/2 push.

But also: would it actually be more dev hours to write out the custom RESTful queries? If you're whitelisting individual queries, what's the difference then? You've just got a more awkward, uncacheable, less split-routable protocol for exactly the same data.

> Shouldn't you use push when you know the client will ask for the resources?

Yes. If I know they intend to join data in, I can push it. I can even do this somewhat speculatively based on statistical patterns in clients. I can tune those values based on real data which can be refined over time.

> Yes. If I know they intend to join data in, I can push it.

Right, so if you don't have the full "query", which you don't with multiple REST round-trips, then you won't push it...

> I can even do this somewhat speculatively based on statistical patterns in clients. I can tune those values based on real data which can be refined over time.

Cool, so guessing. That's exactly what I want my API's performance profile to be based on. Sounds like a lot of work man, why don't you just use GraphQL instead? ;)

> Right, so if you don't have the full "query", which you don't with multiple REST round-trips, then you won't push it...

Yes. But of course, GraphQL ad hoc extensions are discussing limiting this arbitrarily as well.

> Cool, so guessing. That's exactly what I want my API's performance profile to be based on.

No. For example, if I can say that a banking customer wants to see a second page of transactions 90% of the time, then I should push the next page every time. If I can say they want to see the third page of transactions 10% of the time, then it makes sense to defer the cost.
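The decision rule described here could look something like the Python sketch below. The 0.5 cutoff and the `pages_to_push` name are assumptions for illustration; the 90% and 10% figures are the ones from the example above, and actually sending the push is out of scope.

```python
def pages_to_push(follow_up_rates, threshold=0.5):
    """Return the follow-up pages worth pushing along with the first response.

    follow_up_rates maps a page number to the observed fraction of clients
    that go on to request that page. The threshold is an assumed cutoff
    that would be tuned against real traffic.
    """
    return sorted(
        page for page, rate in follow_up_rates.items() if rate >= threshold
    )


# Page 2 is requested 90% of the time, page 3 only 10% of the time.
to_push = pages_to_push({2: 0.9, 3: 0.1})
```

With those rates only page 2 clears the cutoff, so page 3 stays a normal on-demand request.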

Once your build pipeline is set up, there's no development overhead to whitelisting queries.
I don't actually even see the real engineering challenge of whitelisting queries, outside of documentation and communicating errors.
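For reference, one common way to whitelist queries is to hash each known query at build time and have the server accept only known hashes. A minimal Python sketch, assuming SHA-256 IDs and naive whitespace normalization (which would mangle string literals containing significant whitespace); all function names are hypothetical.

```python
import hashlib


def normalize(query: str) -> str:
    """Collapse runs of whitespace so formatting differences do not matter."""
    return " ".join(query.split())


def query_id(query: str) -> str:
    """Stable ID for a query: SHA-256 of its normalized text."""
    return hashlib.sha256(normalize(query).encode("utf-8")).hexdigest()


def build_whitelist(queries):
    """Build step: map each approved query's ID to its normalized text."""
    return {query_id(q): normalize(q) for q in queries}


def lookup(whitelist, qid):
    """Server side: return the approved query text, or None to reject."""
    return whitelist.get(qid)
```

The client then sends only the ID, and anything not in the table is rejected before it reaches a resolver.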

It just becomes a confusing and unhelpful protocol choice if you start whitelisting queries. Without the "query" part it's just a DSL for graph retrieval.

Why not put that client side if it is so expressive?

> GraphQL just makes it so that you can traverse them with a single query.

How does this relate to the "I" in SOLID?

Sorry what class hierarchy are we talking about here? This seems like an extremely contrived application of SOLID.

(1) In what sense is my client making GraphQL queries depending on or tied to the available schema fields it does not query? Where's the coupling?

(2) I have a REST API that doesn't use PUT, PATCH, or DELETE. So I guess every REST API is not SOLID either, because it's using a general-purpose transport with features I'm not using?

(3) Does anyone care? I'm trying to make the best end-user and development experiences. GraphQL allows that. It's an improvement.

SOLID is a set of principles for one type of software (object-oriented), not the principles for all types of software.

You're trying to support a popular DSL and paying extra performance, engineering and architectural costs.

Why not move the wonderful experience client side and use service workers?

People talk a lot about one-query turnaround time, but for websites this is not nearly the same concern or gain that it is for, say, a mobile API. I'm a bit out of the mobile game this year, but AFAICT they aren't rushing to embrace gql.

Please, tell me more about how you’ve profiled both the performance of my apps and my development productivity! Amazing of you to provide this for me.

Meanwhile, NYTimes:

> “Facebook developed it to provide a data source that can evolve without breaking existing code and to favor speed on low-powered and low-quality mobile devices.”

[1] https://open.nytimes.com/react-relay-and-graphql-under-the-h...

> Please, tell me more about how you’ve profiled both the performance of my apps and my development productivity! Amazing of you to provide this for me.

Of course I haven't, but you haven't even begun to address the architectural concerns I've raised. You keep dismissing them without direct comment, then demanding I go further and further into the specifics of your unique use case to be allowed even a high level opinion.

I'm not telling you what to do. But I keep waiting for someone to tell me one good reason why I want to go back to a monolithic API dispatch. Please, tell me. Why should _I_ do this? Why don't you also worry about the architectural regression in your API here?

> Facebook

Facebook has Haskell developers who actually have tools to write performant GraphQL servers. One of my big debates is whether I want to finalize & make public some of my work to efficiently dispatch GraphQL queries.

Haxl makes this entirely possible and other people have done proofs of concept as well. If I see an amazing benefit to finding time in my schedule to do this work, I will.

Of course, most folks won't use it. But we would, and that might be enough. But my other alternative is to do the exact same kind of work in the browser under service-workers. I don't really have many arguments against this other than "well you end up making the same # of queries as before so there's only minimal full query improvement."

But if I can combine a client-side GQL dispatcher with HTTP/2 push work (which inherently uses query caching), it seems better. Again, lots of work in an ecosystem I'm not terribly fond of, but I can do it.

So please consider telling me what the actual benefits are and stop appealing to "developer experience" (I don't dispute this, I just can't make a decision based on it exclusively) and "faster queries" (because I think you lose at least as much performance as you gain and your API becomes a lot less predictable).