Hacker News new | ask | show | jobs
by rhinoceraptor 3704 days ago
I love `json_build_object()`.

With `json_build_object()`, you can select fields as a JSON object and not have to do that JSON building in your app.

Here's a snippet demonstrating building, and then querying a JSON object:

  with location as (select json_build_object(
  	'street1', street1,
  	'street2', street2,
  	'city', city,
  	'state', state,
  	'zip', zip
  ) as loc from my_table)
  
  select
  	loc ->> 'street1' as street1,
  	loc ->> 'street2' as street2,
  	loc ->> 'city' as city,
  	loc ->> 'state' as state,
  	loc ->> 'zip' as zip
  from location;
1 comments

Honest question, why do you consider this better? I know there are development strategies that prefer to put as much model control as possible in the DB, but there are others that prefer to keep the DB more clearly as data, and not the end representation of that data (it's a trade-off, I don't see one as clearly better than the other. One allows for tighter control of data, the other allows more scalability and the use of more tooling).

I guess my question is, to you, what's the main benefit of building JSON in the inthe DB instead of your app?

There are some quite impressive performance gains that can be exploited - especially beneficial if you're using a framework with higher overheads. Using Rails, for example, it's possible to generate JSON output directly from Postgres and stream it to a client, instead of instantiating zillions of ActiveRecord objects just to query flat fields.

There is an obvious trade off of flexibility, but this technique is beneficial in that it allows rich object graphs through the ORM where required, while performing well for simple operations.

I see that more as working around the limitations of the ORM, not really as an inherent strength of doing it inthe DB. For example, in Perl and using DBIx::Class, I would get around that problem by applying another ResultClass to the ResultSet (the HashRefInflater ResultClass), so instead of inflating results into DBIx::Class objects, it returns a hash directly. With some of the helper classes, this is as simple as changing $result_set->search( \%criteria )->all; to $result_set->search( \%criteria )->hri->all;. This allows you to move the serialization task to the controller, and in what should be a fairly efficient way. I can't imagine ActiveRecord doesn't have something equivalent.

Now, where I could see returning JSON being really useful is if you can build complex structures out of it, in a way that follows constraints. E.g.

    {
      "title" : "Cool Hand Luke",
      "released" : "1967-11-01",
      "cast" : [
        { "name": "Paul Newman" },
        { "name": "George Kennedy" },
        ...
      ],
    }
If that can be efficiently generated on the DB, that could help immensely with the current state of prefetching relationships, which to my knowledge, currently requires either redundant data through joins, and is thus increasingly inefficient the more items you relate (such as with DBIx::Class's prefetch), or uses multiple queries (ActiveRecord's :include), which includes either processing between queries or duplicating portions of results in subsequent queries to get the correct subset of relations.
Another advantage from a performance standpoint is that the data used to generate the JSON output is likely already cached in memory by your database engine. With SQL Server (and presumably with Postgres) JSON can be indexed in various ways, leading to even more performance gains.
I have to tell the database which columns I want in either case. The only difference (in my opinion) is now it's in a more convenient form, especially if you have a Node/Ruby/etc app where json/jsonb types is put into a native hash object by the PG library, and you would have to build that object anyways.
Are you telling me there's not a way in Ruby or Node to query the database and put each result directly into the native hash object equivalent? That seems like the sort of thing that gets optimized fairly early in a database interface's existence. That JSON ends up being faster would surprise me, as there's still a deserialization step that needs to take place (JSONB might indeed help with this aspect if available though).

Now, if you are indeed just passing the JSON directly through to some client further up the stack, there probably is some benefit to generating JSON on the DB, as long as your DB isn't CPU constrained. For any other purpose, I imagine generating JSON would be less space efficient and require additional and/or more onerous steps to converting the record to the appropriate language construct.