by erikpukinskis 1950 days ago
I don’t really think of GraphQL as a data querying language like SQL.

I think of it as a domain querying language.

SQL is meant to allow you to write queries to your data model which are:

- arbitrary, and

- efficient

I don’t think of GraphQL that way. I think of it as the place where you encode your set of valid domain actions (i.e. not arbitrary). And I don’t think the consumers of the GraphQL API should think about efficiency. They should just specify what data they need and then the backend is responsible for figuring out how to query the data model efficiently.

In other words, I don’t really see any overlap between GraphQL and SQL in terms of the role they play in a stack.

One helpful thing this distinction allows, is type inference. You can trivially write a type generator that gives you the type signature of a GraphQL query in any language. This is precisely because of its limitations. That allows you to automate the validation of your frontend and backend speaking the same language.

You can’t easily infer the return types of arbitrary SQL queries. To me, that highlights the different purposes of the languages.
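To make the type-inference point concrete, here is a minimal sketch, assuming a made-up schema format and a query written as a nested dict rather than real GraphQL syntax. `SCHEMA`, `SCALARS` and `infer_type` are hypothetical names, not part of any GraphQL library. The result shape falls out of a single walk over schema plus selection, with no need to run the query.

```python
# Hypothetical schema: type name -> field name -> field type.
# A field type is a scalar name ("Int", "String") or another type name;
# a one-element list such as ["User"] means "list of User".
SCHEMA = {
    "User": {"id": "Int", "name": "String", "office": "Office"},
    "Office": {"id": "Int", "city": "String", "staff": ["User"]},
}
SCALARS = {"Int": "int", "String": "str"}


def infer_type(type_name, selection):
    """Return the result shape of `selection` applied to `type_name`.

    `selection` maps each requested field to None (scalar leaf) or to a
    nested selection (object field).
    """
    fields = SCHEMA[type_name]
    shape = {}
    for field, sub in selection.items():
        if field not in fields:
            raise KeyError(f"{type_name} has no field {field!r}")
        declared = fields[field]
        is_list = isinstance(declared, list)
        target = declared[0] if is_list else declared
        if target in SCALARS:
            if sub is not None:
                raise TypeError(f"{type_name}.{field} is a scalar")
            inner = SCALARS[target]
        else:
            if sub is None:
                raise TypeError(f"{type_name}.{field} needs a sub-selection")
            inner = infer_type(target, sub)
        shape[field] = [inner] if is_list else inner
    return shape


# Roughly the query { name office { city staff { name } } } on a User.
query = {"name": None, "office": {"city": None, "staff": {"name": None}}}
result_type = infer_type("User", query)
```

A code generator would emit a language-specific type from `result_type` instead of a dict, but the walk would look much the same.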


> And I don’t think the consumers of the GraphQL API should think about efficiency. They should just specify what data they need and then the backend is responsible for figuring out how to query the data model efficiently.

That is the point of the SQL language. It's declarative. You define what you want your data to look like and the query planner handles the actual fetching of the information in the most efficient way it can. Obviously it's not perfect and you still have to have someone who knows what they're doing to define schemas that make sense and indexes where appropriate, but that is a separate job from defining what data is needed.

Nitpick: it isn’t SQL that lets you define what you want your data to look like, but DDL (https://en.wikipedia.org/wiki/Data_definition_language)

SQL (https://en.wikipedia.org/wiki/SQL), as the name implies, is just for querying.

I don’t see huge benefits for GraphQL. It’s a query language without a good query planner, so APIs will be limited in the kind of queries they support. In that sense, it’s similar to restricting callers to a fixed set of stored procedures instead of full SQL.

That’s a viable idea, but it can also easily be implemented in REST or using a JDBC connection.

IMO, the main benefit of GraphQL is that it allows the caller to specify what fields it wants to see returned. That means the implementation doesn’t have to send information the caller doesn’t need, decreasing bandwidth, and also doesn’t have to provide a zillion endpoints (give me only the address of user ‘foo’, give me address and telephone number of user ‘foo’, give me address and birthday of user ‘foo’, etc.) that have to be maintained at the caller’s whim (“we also need the user’s hair color”). That, however, doesn’t need a full query language. REST could easily be extended to support it.
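As a rough illustration of extending REST this way, here is a sketch of one endpoint that takes a `fields` query parameter. The parameter name, the URL layout, the `get_user` handler and the sample record are all assumptions made up for the example, not an existing standard.

```python
from urllib.parse import parse_qs, urlparse

# Made-up sample data standing in for the user store.
USERS = {
    "foo": {
        "address": "1 Main St",
        "telephone": "555-0100",
        "birthday": "1990-01-01",
        "hair_color": "brown",
    }
}


def get_user(url):
    """Handle GET /users/<id>[?fields=a,b,c] and return only the requested fields."""
    parsed = urlparse(url)
    user_id = parsed.path.rstrip("/").split("/")[-1]
    record = USERS[user_id]
    params = parse_qs(parsed.query)
    if "fields" not in params:
        return dict(record)  # no selection given: return the whole record
    wanted = params["fields"][0].split(",")
    unknown = [name for name in wanted if name not in record]
    if unknown:
        raise ValueError(f"unknown fields: {unknown}")
    return {name: record[name] for name in wanted}


address_and_phone = get_user("/users/foo?fields=address,telephone")
address_and_birthday = get_user("/users/foo?fields=address,birthday")
```

One endpoint covers every field combination, so "we also need the user's hair color" becomes a change in the caller's URL rather than a new endpoint.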

Yes, you could also allow fairly free form GraphQL queries, but if you do, you need a query planner (and, with it, the statistics and metadata that help the query planner perform), or end up with a rat’s nest of special cases that has to be updated for every new type of query.

The great thing about GraphQL is that it's declarative and you have to write your own actual logic to define how the fetching of information works! It only does half the work SQL does! Everyone should switch to GraphQL ASAP!
When your backend is SQL, how can you even do this efficiently? Databases require indexes. If you can query anything then there are performance bombs all over the place. It's different if you're querying Elastic I guess.
> If you can query anything [...]

The answer is simple: You can't. GraphQL _in general_ doesn't allow arbitrary queries. It allows arbitrary output field selection. But the filters are very explicit. It's more "pre-aggregation of request waterfalls and masking of outputs" than "querying a database".

Doesn't stop people from exposing their SQL databases directly from GraphQL by generating a "free-for-all" schema. And when they do - yep, that's definitely a performance bomb and not a good use of GraphQL.

> The answer is simple: You can't. GraphQL _in general_ doesn't allow arbitrary queries.

It really does. Surely, it somewhat limits the data that you get from it by defining a schema. But the moment you allow any nesting/connections between data in that schema, hello n+1 problem.
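For anyone who hasn't hit it, here is a small sketch of how the n+1 pattern shows up with naive per-field resolvers. The in-memory tables and the `fetch_*` helpers are hypothetical; each call to one of them stands in for one SQL round trip.

```python
# Hypothetical in-memory "tables".
OFFICES = {1: {"id": 1, "city": "Oslo"}, 2: {"id": 2, "city": "Lima"}}
USERS = [
    {"id": 10, "name": "ann", "office_id": 1},
    {"id": 11, "name": "bob", "office_id": 2},
    {"id": 12, "name": "cyd", "office_id": 1},
]
query_log = []  # one entry per simulated round trip


def fetch_users():
    query_log.append("SELECT * FROM users")
    return [dict(user) for user in USERS]


def fetch_office(office_id):
    query_log.append(f"SELECT * FROM offices WHERE id = {office_id}")
    return dict(OFFICES[office_id])


def resolve_users_with_office():
    # One query for the list, then one more per user for the nested field.
    users = fetch_users()
    for user in users:
        user["office"] = fetch_office(user["office_id"])
    return users


result = resolve_users_with_office()
query_count = len(query_log)  # 1 + N, with N = 3 users here
```

The same office is even fetched twice here, which is the kind of waste that batching and caching are meant to remove.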

And then every discussion of this problem on HN or elsewhere exposes the ugly truth: almost everyone uses GraphQL as a REST endpoint in production by limiting the actual queries you can run and curbing nesting.

The n+1 problem has solutions though. The most well-known solutions may not suit your architecture, but please can we stop pretending they don't exist?

GraphQL has been public since June 2015, and there's been at least one solution to the n+1 problem (Dataloader) since September 2015. If you were using pure REST endpoints (just resources, no nesting/traversal) this is the exact problem you'd be punting over to the client to solve -- all that GraphQL is doing here is moving it back onto the server. The actual amount of work is the same, you just get faster response times.

Most implementations of GraphQL I've seen in different languages provide some variation on the Dataloader pattern. I'll fully concede it can be a hassle to set it up correctly, but it works.
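To show the general shape of the pattern, here is a simplified sketch using asyncio. This is not the real graphql/dataloader API; `BatchLoader`, `fetch_offices` and the sample data are hypothetical, and the sketch leaves out error handling and any caching across ticks. Resolvers ask for one key each, and the loader turns all the keys requested in the same event-loop tick into a single batched fetch.

```python
import asyncio

# Hypothetical sample data.
OFFICES = {1: {"id": 1, "city": "Oslo"}, 2: {"id": 2, "city": "Lima"}}
USERS = [
    {"id": 10, "name": "ann", "office_id": 1},
    {"id": 11, "name": "bob", "office_id": 2},
    {"id": 12, "name": "cyd", "office_id": 1},
]
batch_log = []  # one entry per simulated round trip


class BatchLoader:
    """Collects keys requested during one event-loop tick and loads them together."""

    def __init__(self, batch_fn):
        self._batch_fn = batch_fn  # takes a list of keys, returns values in the same order
        self._pending = {}         # key -> Future shared by everyone asking for that key
        self._scheduled = False

    def load(self, key):
        loop = asyncio.get_running_loop()
        if key not in self._pending:
            self._pending[key] = loop.create_future()
            if not self._scheduled:
                self._scheduled = True
                loop.call_soon(self._dispatch)  # runs after the already-queued resolvers
        return self._pending[key]

    def _dispatch(self):
        pending, self._pending = self._pending, {}
        self._scheduled = False
        keys = list(pending)
        for key, value in zip(keys, self._batch_fn(keys)):
            pending[key].set_result(value)


def fetch_offices(ids):
    batch_log.append(list(ids))  # stands in for: SELECT ... WHERE id IN (...)
    return [OFFICES[i] for i in ids]


async def resolve_user(user, loader):
    office = await loader.load(user["office_id"])
    return {"name": user["name"], "office": office}


async def main():
    loader = BatchLoader(fetch_offices)
    return await asyncio.gather(*(resolve_user(user, loader) for user in USERS))


resolved = asyncio.run(main())
```

Three resolvers run, but `fetch_offices` is called once with the two distinct office IDs.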

> The n+1 problem has solutions though.

It does. But it also means that the problem exists. It's there, you run into it by default, and you have to take special care to make sure it doesn't happen. And data-loaders are just a first step. Some systems try to actually calculate query complexities and nesting depth.
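A nesting-depth check is the simplest version of that idea. Here is a minimal sketch, assuming the query has already been parsed into a nested dict where a leaf field maps to `None`; `selection_depth`, `check_depth` and the limit are hypothetical names and values, not any particular server's API.

```python
def selection_depth(selection):
    """Depth of a nested selection; a selection with only leaf fields has depth 1."""
    nested = [sub for sub in selection.values() if sub is not None]
    return 1 + max((selection_depth(sub) for sub in nested), default=0)


def check_depth(selection, max_depth):
    """Reject the query before any resolver runs if it nests too deeply."""
    depth = selection_depth(selection)
    if depth > max_depth:
        raise ValueError(f"query depth {depth} exceeds limit {max_depth}")
    return depth


# Roughly { users { name office { address { street } } } }
query = {"users": {"name": None, "office": {"address": {"street": None}}}}
depth = selection_depth(query)
```

Complexity scoring generalizes this by giving each field a cost and summing over the tree instead of taking the maximum depth.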

> this is the exact problem you'd be punting over to the client to solve -- all that GraphQL is doing here is moving it back onto the server. The actual amount of work is the same

Exactly. The complexity doesn't go anywhere.

I... don't know how all this is an argument against what I said.

> It does. But it also means that the problem exists.

The exact same problem exists on the client side with REST. I get what you're saying, but it's a lot easier to fix N+1 issues at the GraphQL resolver level, because once you've fixed it, you don't have to touch it again. With REST, you end up either creating ad hoc endpoints or changes to solve each individual problem in isolation, or you end up building a lot more flexibility into your REST API to solve it in a general way, in which case you've badly reinvented GraphQL without benefiting from the existing ecosystem.

The complexity goes to where the round-trips are shorter, and where the benefits can be shared between all clients regardless of language; how is this not a good thing?
I never fully solved this problem, so don’t trust me, but I can tell you what I learned...

I think for one thing you can’t really rely on joins for query efficiency, because as you say there are too many combinations so it’s impossible to optimize everything.

Instead you have to try to query each data type separately. So you get a query for users. You do an SQL call and gather up a bunch of requests for offices, and then you do a single request to your office backend.

I think the best case is something like n SQL queries per request, where n is the depth of the tree you are querying (users->office->address is depth 3).

That means you’re doing all your queries after the first one by ID (not by arbitrary columns). So you have to have some way to “pre-join” your tables. You can do this either by optimistically joining your data to everything around it (query the node plus all of its edges) or you need to store your edges in your data model (which I have to assume is what FB does).

In the end your resolvers need to be using some standardized way of grabbing objects by ID (or edge), something like https://github.com/graphql/dataloader
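The one-query-per-level idea can be sketched roughly like this, assuming hypothetical in-memory tables and helper functions where each helper call stands in for one SQL query. Every level after the first is fetched by ID in one batch, so users->office->address costs three queries no matter how many users come back.

```python
# Hypothetical tables; offices and addresses are keyed by ID.
USERS = [
    {"id": 1, "name": "ann", "office_id": 7},
    {"id": 2, "name": "bob", "office_id": 8},
    {"id": 3, "name": "cyd", "office_id": 7},
]
OFFICES = {
    7: {"id": 7, "name": "HQ", "address_id": 70},
    8: {"id": 8, "name": "Lab", "address_id": 80},
}
ADDRESSES = {70: {"id": 70, "city": "Oslo"}, 80: {"id": 80, "city": "Lima"}}
queries = []  # one entry per simulated SQL query


def select_users():
    queries.append("SELECT * FROM users")
    return [dict(user) for user in USERS]


def select_by_ids(table_name, table, ids):
    queries.append(f"SELECT * FROM {table_name} WHERE id IN {sorted(ids)}")
    return {i: dict(table[i]) for i in ids}


def users_with_office_address():
    users = select_users()  # level 1
    offices = select_by_ids(  # level 2: all offices in one batch
        "offices", OFFICES, {user["office_id"] for user in users}
    )
    addresses = select_by_ids(  # level 3: all addresses in one batch
        "addresses", ADDRESSES, {office["address_id"] for office in offices.values()}
    )
    # Stitch the levels back together in memory.
    for office in offices.values():
        office["address"] = addresses[office["address_id"]]
    for user in users:
        user["office"] = offices[user["office_id"]]
    return users


tree = users_with_office_address()
```

The stitching happens in application code rather than in a SQL join, which is the trade-off described above.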

Whether it’s possible to do this efficiently I don’t know. At my last job we messed it up, and then we started applying a strategy like I described above, but then I switched jobs.

Would love to hear from others who have dealt with the same challenges.

So SQL is not a database :). It is a data access DSL that is implemented by databases. I don't think it's true that SQL is untyped: the table schemas are types (albeit basic product/record types). Inferring the type of a result is quite reasonable if you start with the schemas. SQL suffers from a UX problem for sure.
If we consider GraphQL to be a domain querying language, what have we gained over REST? You are free to model endpoints in REST according to your domain (and deal with complexity behind the facade), and I'd argue that REST can offer an even more ergonomic DSL interface if you just write whatever you want (s-expressions, let's say) and pass them to some POST endpoint that reads your DSL and parses it. If the idea of writing the DSL and POSTing it, parsing it, and doing the very specific logic you want sounds wrong to you, it seems like GraphQL should also sound similarly wrong. If it sounds good to you, then does GraphQL go far enough?

> I don’t think of GraphQL that way. I think of it as the place where you encode your set of valid domain actions (i.e. not arbitrary). And I don’t think the consumers of the GraphQL API should think about efficiency. They should just specify what data they need and then the backend is responsible for figuring out how to query the data model efficiently.

This is how SQL works, so there is some overlap there. Optimizing SQL queries might be a performance-seeking operation, but SQL is declarative, and it is left largely to the query optimizer to make your queries run fast. You can help the query optimizer make the query run fast, but that's all you can do -- and I can guarantee you that doing query optimization has not gone away due to GraphQL, you've just pushed the problem somewhere else, or you're forgetting the bits of your API that you've modeled awkwardly in order to avoid performance degradation/difficult-to-write resolvers.

But I think we're a bit off-track here -- GraphQL and REST are at a different level of abstraction than SQL. My point is that we've taken a step back from what we had already with REST rather than that people should be using SQL on the front-end. I think GraphQL is doomed to attempt to reach expressive parity with SQL but that's another conversation altogether.

> One helpful thing this distinction allows, is type inference. You can trivially write a type generator that gives you the type signature of a GraphQL query in any language. This is precisely because of its limitations. That allows you to automate the validation of your frontend and backend speaking the same language.
>
> You can’t easily infer the return types of arbitrary SQL queries. To me, that highlights the different purposes of the languages.

Most sufficiently advanced ORMs can also give you this, and in other languages there are libraries that will compile-time-check the arbitrary SQL queries you write and won't compile if they're invalid. What you need to have that kind of thing work is sufficient type-checking power (TypeScript offers this) and sufficiently rich metadata (there are some examples in the Haskell[0] and Rust[1] worlds). It wasn't necessary to throw away REST to get these kinds of benefits. I've been quite happy with TypeORM for example, and it would form a good base for this kind of effort -- I don't know a library that's already doing it, but this actually isn't as hard as you think, especially for the simple case.

I'd argue (without too much evidence, to be fair, as I am not an expert in the inner workings of GraphQL) that, for the simple case (i.e. the actual subset of SQL that GraphQL represents), parsing and validating a GraphQL query is no different in difficulty from parsing and validating actual SQL.

[0]: https://hackage.haskell.org/package/postgresql-typed-0.6.1.2...

[1]: https://github.com/launchbadge/sqlx

> If we consider GraphQL to be a domain querying language, what have we gained over REST?

Part of the value is in standardization. Yes, you can get most of the benefits of GraphQL by creating your own layer over REST, but then you've just written a badly specified, bug-ridden version of GraphQL. The latter has enough momentum that there now exist tons of tools for working with it, which obviously wouldn't be true for anything you build yourself.

> Most sufficiently advanced ORMs can also give you this, and in other languages there are libraries that will compile-time-check the arbitrary SQL queries you write and won't compile if they're invalid.

You're missing the point here: GraphQL gives you type safety between the server and client. This has nothing to do with ORMs or your database. What this means is that, when building your web (or Android, or iOS, or refrigerator, or whatever) frontend, you can guarantee the type of every part of your query before even executing it. This is a powerful guarantee, and paired with something like GraphiQL[0], it allows for a level of exploratory programming that isn't currently possible with REST.

[0]: https://graphql.org/swapi-graphql

> Part of the value is in standardization. Yes, you can get most of the benefits of GraphQL by creating your own layer over REST, but then you've just written a badly specified, bug-ridden version of GraphQL. The latter has enough momentum that there now exist tons of tools for working with it, which obviously wouldn't be true for anything you build yourself.

Right, and you could have done this standardization somewhere else right -- GraphQL is a NIH version of what could have existed on top of well-considered existing standards.

> You're missing the point here: GraphQL gives you type safety between the server and client. This has nothing to do with ORMs or your database. What this means is that, when building your web (or Android, or iOS, or refrigerator, or whatever) frontend, you can guarantee the type of every part of your query before even executing it. This is a powerful guarantee, and paired with something like GraphiQL[0], it allows for a level of exploratory programming that isn't currently possible with REST.

I was responding to the point the person made in particular, I'm aware of the mismatch in tiers, but their question was specifically about being able to infer types from SQL queries. They were comparing SQL to GraphQL. There is nothing stopping you from using compile-time-checking on the client side, if you use something like TypeScript, which is why the rest of the surrounding example was given.

This also is related to the database, because the original point was that people go GraphQL -> Resolver -> DB, rather than HTTP -> HTTP Endpoint -> DB, and because they have chosen GraphQL, they must now write efficient, general resolvers that are essentially hand-built query optimizers whereas with REST you can build with higher granularity and usually higher, easier-to-achieve efficiency.

You can guarantee every part of your query with REST as well, if what you mean is you can avoid writing an invalid query -- TypeScript + generated OpenAPI client libraries do this very well -- it's not unique to GraphQL.

> Right, and you could have done this standardization somewhere else right -- GraphQL is a NIH version of what could have existed on top of well-considered existing standards.

Maybe, but I think building GraphQL on top of REST would have produced something much more convoluted and verbose. As an example, see the OData standard[0], which, at least from the client's perspective, is unnecessarily complex compared to GraphQL. There may be cleaner ways to structure this, but I'm not aware of any such attempts, and I doubt you'd get a result as easy to use and understand as GraphQL.

> This also is related to the database, because the original point was that people go GraphQL -> Resolver -> DB, rather than HTTP -> HTTP Endpoint -> DB, and because they have chosen GraphQL, they must now write efficient, general resolvers that are essentially hand-built query optimizers whereas with REST you can build with higher granularity and usually higher, easier-to-achieve efficiency.

That's fair, but as you pointed out earlier, you're just shifting complexity around in any case. REST is simple because it's so granular, which is great when all of your data needs are met by that granularity. As soon as you have more complex data requirements, you end up either making ad hoc endpoints or evolving your API into a giant monstrosity so that you can handle the more general cases. GraphQL significantly simplifies this implementation on both the client side and the API side. Yes, the tradeoff is that now you need to optimize your data access manually, but as others have said, there are now tools for dealing with this issue—and it's not like you don't gain anything from it.

For a simple but concrete example, if I need to fetch data from ten different entity types in order to render a page in my web app, I can do so in a single GraphQL query and handle caching, etc in one pass, as opposed to needing ten different REST calls and having to consolidate them by hand. On top of the maintainability benefits of this approach, the productivity gains are huge.

[0]: https://www.odata.org/

> Maybe, but I think building GraphQL on top of REST would have produced something much more convoluted and verbose. As an example, see the OData standard[0], which, at least from the client's perspective, is unnecessarily complex compared to GraphQL. There may be cleaner ways to structure this, but I'm not aware of any such attempts, and I doubt you'd get a result as easy to use and understand as GraphQL.

I'd never heard of OData, thank you very much for the link! I've been meaning to give it a shot (try to put together a GraphQL using the other REST tools/standards), if I do I will definitely post it to HN. I agree that it would likely look more complex in the naive case but that could be ironed out -- the foundations feel identical. GraphQL's productivity and ergonomics are definitely something to strive for on the REST side.

> That's fair, but as you pointed out earlier, you're just shifting complexity around in any case. REST is simple because it's so granular, which is great when all of your data needs are met by that granularity. As soon as you have more complex data requirements, you end up either making ad hoc endpoints or evolving your API into a giant monstrosity so that you can handle the more general cases. GraphQL significantly simplifies this implementation on both the client side and the API side. Yes, the tradeoff is that now you need to optimize your data access manually, but as others have said, there are now tools for dealing with this issue—and it's not like you don't gain anything from it.

True -- growth of REST API endpoints is somewhat unbounded and it gets harder and harder to differentiate/name endpoints and separate functionality in a satisfying way. There was another comment[0] that showed the gains that a small team got from GraphQL, I can't argue with that. It does work extremely well for some teams.

> For a simple but concrete example, if I need to fetch data from ten different entity types in order to render a page in my web app, I can do so in a single GraphQL query and handle caching, etc in one pass, as opposed to needing ten different REST calls and having to consolidate them by hand. On top of the maintainability benefits of this approach, the productivity gains are huge.

This has been an oft-cited benefit of GraphQL, and it looks like a legitimate clear benefit -- not having to think about that is indeed very nice.

[EDIT] - Yeah after poking around OData for a bit, I don't think I'll be touching that but it's nice to know it exists.

[0]: https://news.ycombinator.com/item?id=26226642

> I've been meaning to give it a shot (try to put together a GraphQL using the other REST tools/standards), if I do I will definitely post it to HN.

Please do! I'd love to see what you come up with.