by colemorrison 3274 days ago
I've looked at GraphQL a number of times. Does anyone have any practical examples of integrating it with backend(s), APIs, and/or specific databases?

So instead of "we use GraphQL, much love" + basic example and how it looks on React - a "here's how we take that structure and resolve it and return it." Because that structure looks amazingly sweet - but if in the background it's requiring circles of work, work and rework...

Anyhow, maybe I just don't understand it enough.

11 comments

We are using the Sangria[0] framework with a Play[1] app. Sangria does all of the GQL-related stuff and Play does the usual server stuff. Sangria's documentation is quite good, but the part that answers your question will be in the "Schema Definition"[2] section, which is where you describe the schema of your graph, and how each field is resolved.

[0] http://sangria-graphql.org [1] https://www.playframework.com [2] http://sangria-graphql.org/learn/#schema-definition

This is definitely a real problem and we just launched a website yesterday which we hope will grow into a resource for this kind of more advanced content: http://www.graphql.com/guides/

You can also see a lot of examples of GraphQL server code for JS here: https://github.com/apollographql/launchpad

It includes connecting to DBs, APIs, etc.

You've hit a (pain) point. While graphql reduces the number of trips to the backend for the browser, those round trips get pushed down to a lower level, between backend and db. The reference graphql-js implementation looks so easy at first (you just write a resolver for a field), but it makes it oh so easy to have terrible n+1 problems. dataloader helps a bit but it's not as optimal as it can be.

You need to be very careful. To be optimal, you'll basically need to write very complex resolvers that inspect the AST themselves and fetch the data optimally, which at that point is almost the same as writing your own custom execution module.

That is basically what I did: a custom execution module that translates a graphql request to a single sql query (https://subzero.cloud/)
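
To make the n+1 point concrete, here is a minimal Python sketch of field-by-field resolution. It is not graphql-js or any real GraphQL library; the tables, the resolver names and the `run` query counter are all invented for illustration, using an in-memory SQLite database.

```python
import sqlite3

# Hypothetical in-memory schema, invented for this illustration.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE projects (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO projects VALUES (1, 1, 'a1'), (2, 1, 'a2'), (3, 2, 'b1');
""")

query_count = 0


def run(sql, params=()):
    """Run a query and count it, so the number of round trips is visible."""
    global query_count
    query_count += 1
    return db.execute(sql, params).fetchall()


def resolve_users():
    # Root field resolver: one query for the list of users.
    rows = run("SELECT id, name FROM users ORDER BY id")
    return [{"id": user_id, "name": name} for user_id, name in rows]


def resolve_user_projects(user):
    # Field resolver: called once per user, so it issues one query per user.
    rows = run(
        "SELECT title FROM projects WHERE user_id = ? ORDER BY id",
        (user["id"],),
    )
    return [title for (title,) in rows]


def execute_naive():
    # Mimics field-by-field execution of: { users { name projects } }
    return [
        {"name": user["name"], "projects": resolve_user_projects(user)}
        for user in resolve_users()
    ]


result = execute_naive()
print(result)
print("queries issued:", query_count)  # 1 for the users + 1 per user
```

With N users this issues N + 1 queries, which is the pattern that batching or a single generated SQL statement is meant to avoid.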

Sometimes I worry whether GraphQL is analogous to ORMs: a theoretically elegant attempt to solve an impedance mismatch between a data source and its consumer, but one that ultimately just shifts or redistributes the impedance to different layers of the codebase...
I agree it seems like potentially the same problem, and it manifests in forcing devs to write 'resolvers' which looks like horrible drudge work. So I wonder if there are decent offerings without the impedance mismatch between GraphQL queries and the database (so we don't have to translate them into SQL). I did a very quick look and at least found this: https://dgraph.io/

I'm getting the impression that if nothing else though, I'm probably gonna need to wait a few years for this tech to stabilize more. Which I find very unfortunate, because the standard way of doing things these days feels very unpleasant to me (I hate writing boilerplate more than anything: have an overuse injury, so typing is the worst part of coding), and the GraphQL queries do seem to make a lot of sense (although forcing clients to have explicit knowledge of schema structure seems a little dangerous...).

Here's a project that might be interesting.

https://github.com/postgraphql/postgraphql

Technique inspired by PostgREST and graphql spin "stolen" (in a good way) from subZero :), although a completely different implementation method than subZero
That's really cool - we're doing the same thing, translating entire queries into a single SQL statement!
I hope you are using PostgreSQL! :) Check out PostgREST and the types of queries it generates
Yes, Postgres all the way. We don't use PostgREST though we construct the same type of queries.
> custom execution module to translate a graphql request to a single sql query

If I have a schema like this:

    users
      projects   (many per user)
      blog_posts (many per user)
What SQL do you generate to handle a request for everything in a single query?
take a look at PostgREST https://github.com/begriffs/postgrest

simplified example query https://pastebin.com/7J2ZswRC
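
For readers who can't open the link, here is a rough sketch of the general idea. It is not the pastebin contents and not PostgREST's actual generated SQL. It assumes SQLite with the JSON functions available (`json_group_array`) instead of PostgreSQL, and the table contents are made up. Each child collection is aggregated into a JSON array by a correlated subquery, so the two "many per user" relations don't multiply each other the way a plain double join would.

```python
import json
import sqlite3

# Hypothetical data for the users / projects / blog_posts schema above.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE projects (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT);
    CREATE TABLE blog_posts (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO projects VALUES (1, 1, 'p1'), (2, 1, 'p2');
    INSERT INTO blog_posts VALUES (1, 1, 'post1'), (2, 2, 'post2');
""")

# One statement, one round trip: every child list comes back as a JSON array.
SINGLE_QUERY = """
    SELECT u.name,
           (SELECT json_group_array(p.title)
              FROM projects p
             WHERE p.user_id = u.id) AS projects,
           (SELECT json_group_array(b.title)
              FROM blog_posts b
             WHERE b.user_id = u.id) AS blog_posts
      FROM users u
     ORDER BY u.id
"""


def fetch_users_tree():
    """Run the single query and decode the JSON columns into Python lists."""
    rows = db.execute(SINGLE_QUERY).fetchall()
    return [
        {
            "name": name,
            "projects": json.loads(projects),
            "blog_posts": json.loads(blog_posts),
        }
        for name, projects, blog_posts in rows
    ]


tree = fetch_users_tree()
print(json.dumps(tree, indent=2))
```

In PostgreSQL the same shape is usually written with `json_agg` and lateral joins or subqueries; the element order inside each array is not guaranteed unless you ask for one.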

Ah, nice trick - thanks for the tip.
Which n+1 problems doesn't dataloader solve?
It eliminates n+1 but not in a perfect way.

If you have a 3 level query (3 tables), the best you can hope for is to get 3 sequential queries: get the first level, collect the ids, request the second level, collect the ids, get the 3rd level. It gets even more complicated the more levels you have and the bigger the dataset returned.

All of this can be done using a single join, which is one round trip and is faster.
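
A small sketch of that difference, in Python with an in-memory SQLite database. This is not the dataloader library itself, only the batching pattern it enables; the three tables, `load_children` and the `run` counter are hypothetical names made up for the example.

```python
import sqlite3
from collections import defaultdict

# Hypothetical 3-level schema: users -> projects -> tasks.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE projects (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT);
    CREATE TABLE tasks (id INTEGER PRIMARY KEY, project_id INTEGER, title TEXT);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO projects VALUES (10, 1, 'p1'), (11, 2, 'p2');
    INSERT INTO tasks VALUES (100, 10, 't1'), (101, 10, 't2'), (102, 11, 't3');
""")

round_trips = 0


def run(sql, params=()):
    """Run a query and count it as one round trip."""
    global round_trips
    round_trips += 1
    return db.execute(sql, params).fetchall()


def load_children(table, fk, parent_ids):
    """Batch step: fetch the children of every parent at this level in one query."""
    if not parent_ids:
        return {}
    marks = ",".join("?" * len(parent_ids))
    sql = f"SELECT id, {fk}, title FROM {table} WHERE {fk} IN ({marks}) ORDER BY id"
    grouped = defaultdict(list)
    for child_id, parent_id, title in run(sql, tuple(parent_ids)):
        grouped[parent_id].append({"id": child_id, "title": title})
    return grouped


def batched():
    # Level 1, collect ids, level 2, collect ids, level 3: three sequential queries.
    users = run("SELECT id, name FROM users ORDER BY id")
    projects = load_children("projects", "user_id", [uid for uid, _ in users])
    project_ids = [p["id"] for plist in projects.values() for p in plist]
    tasks = load_children("tasks", "project_id", project_ids)
    return [
        {
            "name": name,
            "projects": [
                {"title": p["title"],
                 "tasks": [t["title"] for t in tasks.get(p["id"], [])]}
                for p in projects.get(uid, [])
            ],
        }
        for uid, name in users
    ]


def joined():
    # The same data as flat rows from one join: a single round trip.
    return run("""
        SELECT u.name, p.title, t.title
          FROM users u
          LEFT JOIN projects p ON p.user_id = u.id
          LEFT JOIN tasks t ON t.project_id = p.id
         ORDER BY u.id, p.id, t.id
    """)


tree = batched()
batched_trips = round_trips
rows = joined()
joined_trips = round_trips - batched_trips
print("batched round trips:", batched_trips)
print("joined round trips:", joined_trips)
```

The batched version costs one query per level no matter how many rows there are, while the join stays at one query but returns flat rows that still have to be folded back into a tree.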

If you have low latency queries (eg. because most things hit cache rather than db) sequential queries aren't as much of a problem. Being able to join everything into one query, on the other hand, is a luxury which can be hard to maintain at scale.

In the 'join in SQL' case you could identify particular cases which are doing sequential queries and implement a different loader which just does one. It's not automatic but perhaps in most cases it's not necessary to do this step anyway. In the worst case you're back to doing as much work as you would for a bespoke API endpoint, but that's not the typical case. How much of a problem this ends up being in practice very much depends on the type of app you're building and how you intend to scale it.

Saying join is a luxury is a dangerous thing :) (for impressionable devs :)) 99% of projects are not "at FB scale" :) so join is exactly the right thing to use. It's been tuned over decades, so until your scale/dataset outgrows one box (and there are big boxes now), you are not going to do a better job than the query optimiser (because after all that is what you are trying to do).
Fair enough. What I'm trying to get at is that you need to pay attention to the cost of the real world queries which are actually being executed against your API, and then optimize. I'm not sure that SQL is a magic bullet there either (you can still request too much data at once, for example).

In practice this probably means adding some logging of how long requests take and graphing it (say, 95th percentile request time) from time to time to spot pathological queries. Even better if you can automate it. I think this is stuff that everyone should be doing (after a certain stage), regardless of whether you are using GraphQL or a bespoke JSON API.
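
As a rough sketch of that kind of logging (stdlib Python only; `timed`, `handle_request` and the in-memory `durations_ms` list are hypothetical stand-ins, and a real setup would ship the timings to a metrics system instead of a list):

```python
import math
import time
from functools import wraps

durations_ms = []  # one entry per handled request


def timed(handler):
    """Record how long each call to the wrapped request handler takes."""
    @wraps(handler)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return handler(*args, **kwargs)
        finally:
            durations_ms.append((time.perf_counter() - start) * 1000.0)
    return wrapper


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample that is >= pct% of all samples."""
    if not samples:
        raise ValueError("no samples recorded")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


@timed
def handle_request(query):
    # Stand-in for real GraphQL execution.
    return {"data": {"echo": query}}


for i in range(20):
    handle_request(f"{{ user(id: {i}) {{ name }} }}")

print("p95 request time (ms):", percentile(durations_ms, 95))
```

Logging the query text (or a hash of it) next to each duration is what lets you trace a bad percentile back to the pathological query.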

I'd say build out your GraphQL server under the assumption that joins aren't always an option, but allow them as an optimization where possible.
Thank you, thank you. A critical piece of information.
You don't have to worry so much about n+1 (once you're using dataloader) as much as you do deeply nested queries, or queries that run against a large number of datatypes.
I am not sure I follow: the deeper the query or the bigger the returned dataset, the worse the performance of a dataloader type solution (see my other comment).
We're agreeing here, I think. I'm saying that after taking care of the n+1 problem by using dataloader, you still have to worry about deeply nested queries.
This is true, but most complex UIs tend to result in queries that spread wide rather than deep. It takes some deliberate contrivance or fairly unusual real-world cases to get to more than 3-4 levels of nested relationships, and this is often the point at which you'd be thinking about deferred/lazy loading in the client anyway.

I've spent a significant amount of time over the last year or so optimizing GraphQL servers built on domain-driven services (i.e. joins aren't an option) and managed to get to equal (or very marginally worse) performance to existing handcrafted endpoints that returned equivalent data (it was possible to build the same UI, even though the payloads weren't identical).

There are areas where GraphQL is inherently inefficient (trying to work on ways to mitigate these issues), but the reality is that deeply-nested UI appears to be less of a problem than I originally thought it would be.

I strongly suggest that you look at Apollo's GraphQL offerings. My 'aha' moment with GraphQL came while reading the docs for their GraphQL server[0].

IMHO Relay doesn't make a lot of sense. Apollo Client [1] has a much better feature set for most use cases, doesn't need React and is better documented.

[0] http://dev.apollodata.com/tools/graphql-tools/resolvers.html [1] http://dev.apollodata.com/core/

+1 to that. I'm working on a (mostly) GraphQL application with a Rails backend and a React frontend with a touch of Redux. I really wanted to like Relay but the documentation was lacking (I suspect that a major API change is to blame) and the amount of boilerplate is prohibitive. On the other hand Apollo Client is straightforward, framework agnostic with great React bindings. As a web development veteran I feel that I've never been as productive as with this particular stack.
Used Relay Classic for a year and a half, been using Relay Modern for a month or so. You are absolutely right, the documentation situation is terrible. The fact that the new mutation API has practically zero documentation is troubling.

I've been sticking with it though, and I am enjoying it. I feel like I have a greater grasp of what's going to be executed and when than I ever did with Relay Classic, and the file size + performance improvements are worth the cost of admission in my mind.

Have you been using normal Relay or one of the forks/modified versions that supports server-side rendering? The fact that Apollo Client supports server-side rendering out of the box was a big plus for me.

> the file size + performance improvements are worth the cost of admission in my mind.

Is this Relay Classic vs Relay Modern or Relay vs Apollo?

The libraries that support server-side rendering aren't forks, they sit alongside Relay. But yes, I've been using them for well over a year without any issues. I haven't tried it with Relay Modern yet, but there are examples out there of how to do it.

Relay Modern is 20% of the size of Relay Classic (5 times smaller), which (if my calculations are correct, I don't have equivalent environments set up) is just over half the size of React Apollo + Apollo Client.

As a standalone all-by-myself indie developer I was skeptical of GraphQL when it first appeared, but later I discovered it is much easier to develop APIs for my own consumption with GraphQL. The basic GraphQL boilerplate seems bad, but is actually fun to write and makes total sense. And once you have the structure in place it is super easy to add functionality.

Plus bonuses if you have more than one database or are mixing data from your database and external APIs in your backend responses.

Please try it for a small project, it is unbelievable, but you'll probably enjoy it.

(For the record: I've just used https://github.com/graphql-go/graphql and Lokka on the client, because it is simple and does nothing fancy, it's a thin wrapper over XHR, I think.)

> Plus bonuses if you have more than one database or are mixing data from your database and external APIs in your backend responses.

You can easily do that with a RESTful API too.

You can do anything with any programming language, framework or design pattern.
Yes, hence it's not a bonus.
Here's a high-level article about performance in general, but it shows how a UI and query could map to (for example) a function-driven API (this could be local to the server or remote):

https://dev-blog.apollodata.com/optimizing-your-graphql-requ...

Examples more specific to a particular backend technology feel redundant because my assumption is that once you're in the land of calling functions, we don't need to hold your hands anymore.

The most important thing is to be aware of the different batching strategies that are available to you in each GraphQL implementation because I believe this is the most critical part of getting a GraphQL server to perform well with anything other than a graph database.

Assuming all your data comes from the same database, you can reduce one graphql query to one sql query that you can run on PostgreSQL (which is not a graph database). So batching is not the only option (and probably not the best).
If you're using Rails there's a very good gem with some great examples:

https://github.com/rmosolgo/graphql-ruby

Ryan supports development with a pro version that has a bunch of neat features, including built-in support for a handful of common authorization frameworks. I highly recommend it.

On the server side it's very similar to REST. You define your types in a schema file, and then write functions that go fetch the data for each type. These functions are called resolvers.

For example you have a Product type, and then write the resolver function that queries the database and/or another API. When you have the data, you pass it back to the client via Apollo or Relay.
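
A toy sketch of that structure in plain Python. It uses no real GraphQL library; the `Product` data, the `RESOLVERS` table and the tiny `execute` function are all made up to show the shape, with one field coming from a pretend database and another from a pretend external API.

```python
# Hypothetical data sources: a "database" table and an "external API".
PRODUCTS_DB = {1: {"id": 1, "name": "Widget", "price_cents": 499}}
STOCK_API = {1: 12}


def resolve_product(_parent, args):
    # Query.product: fetch the product row from the database.
    return PRODUCTS_DB.get(args["id"])


def resolve_stock(product, _args):
    # Product.stock: look the stock level up in another service.
    return STOCK_API.get(product["id"], 0)


# One resolver per (type, field); fields without an entry are read off the parent.
RESOLVERS = {
    ("Query", "product"): resolve_product,
    ("Product", "stock"): resolve_stock,
}


def execute(type_name, parent, selection):
    """Walk the selection and call the resolver for each requested field."""
    result = {}
    for field, spec in selection.items():
        resolver = RESOLVERS.get((type_name, field))
        if resolver is not None:
            value = resolver(parent, spec.get("args", {}))
        else:
            value = parent[field]  # default resolver: plain key lookup
        if value is not None and "fields" in spec:
            value = execute(spec["type"], value, spec["fields"])
        result[field] = value
    return result


# Hand-built equivalent of the query: { product(id: 1) { name stock } }
query = {
    "product": {
        "args": {"id": 1},
        "type": "Product",
        "fields": {"name": {}, "stock": {}},
    }
}

response = execute("Query", None, query)
print(response)  # {'product': {'name': 'Widget', 'stock': 12}}
```

Real servers parse the query text and validate it against the schema first; the resolver-per-field part is what you write yourself.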

The big advantage over REST is that the client can define what data it wants and how it wants it. If you are a full stack dev this isn't such a great advantage, but for bigger projects where front/back are spread among many engineers this can be an advantage. Also, since the schema defines the types, your API is almost self-documented, so to speak.

The big disadvantage is authentication and authorization. We kept using REST for authentication, and we couldn't find any ready made solution for role based authorization like you have in Express, Hapi, etc.
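
One way to hand-roll role checks is to wrap each resolver. The following is only a sketch under assumptions: it is not an existing library, and `require_role`, `Forbidden` and the shape of the `context` dict (a `roles` list filled in by whatever does authentication) are invented for the example.

```python
from functools import wraps


class Forbidden(Exception):
    """Raised when the caller has none of the roles a resolver requires."""


def require_role(*allowed):
    """Decorator factory: only run the resolver if the context has an allowed role."""
    def decorate(resolver):
        @wraps(resolver)
        def guarded(parent, args, context):
            roles = set(context.get("roles", ()))
            if roles.isdisjoint(allowed):
                raise Forbidden(
                    f"{resolver.__name__} requires one of {sorted(allowed)}"
                )
            return resolver(parent, args, context)
        return guarded
    return decorate


@require_role("admin")
def resolve_delete_product(parent, args, context):
    # Hypothetical mutation resolver; a real one would touch the database.
    return {"deleted": args["id"]}


# The context would normally be built once per request by the auth layer.
print(resolve_delete_product(None, {"id": 1}, {"roles": ["admin"]}))

try:
    resolve_delete_product(None, {"id": 1}, {"roles": ["viewer"]})
except Forbidden as err:
    print("rejected:", err)
```

Because the check lives on the resolver, it applies however deeply the field is nested in a query, which is the part a route-level REST middleware doesn't give you.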

I think a combination of REST and GraphQL is the better approach.

I've written a fair bit about it. We use it at AppNexus internally for our UIs, and it simply maps back to a REST API. Since each API has its quirks and oddities, it's a nice abstraction layer for consistency.

http://joelgriffith.net/lessons-learned-wrapping-a-rest-api-... For starters

I agree. All of the examples are trivial. There are a lot of nice things about graphql. Yet there are a lot of problems and annoyances that other approaches don’t have, such as query batching. Even authorizing certain graphql queries / mutations is not a trivial thing. I’ve been able to solve a couple of these things in my own app, but it’s just not something built into the spec.
We're using graphql in front of a Drupal 8 data engine, with a variety of other data sources behind that. It's also for a publishing company, that has a bunch of different front end sites all served from the same content store... But with separate development groups. We're using the youshido graphql library for it... Though now that development is further along, I wish we'd chosen the webonyx one.

A good GraphQL backend resolves the graph into query results efficiently, rather than just field by field. But that does take a bit of getting used to....