JSONB capabilities in Postgres are amazing, but the syntax is really annoying - for example, I'm forever mixing up `->` and `->>`. This new syntax feels far more intuitive.
Not much different from some_jsonb#>>'{some,path}' and once you add the need to convert out of jsonb to text, you'll not be saving any characters either. At least for queries.
It shouldn't matter so much, but when you don't use one language as much as you do other languages, it becomes that much harder to remember unfamiliar syntaxes and grammars, and easier to confuse similar looking operations with each other.
In that case this does not help. SELECT json['a']; will not return the value of the string in {"a":"ble"} (like it does in Javascript), but a JSON encoding of that string, so '"ble"'. You'll still not be able to do simple comparisons like `SELECT json_col['a'] = some_text_col;` Superficial familiarity, but it still behaves differently than you expect.
Is there even a function that would convert JSON encoded "string" to text it represents in postgresql? I didn't find it.
So all you can do is `SELECT json_col['a'] = some_text_col::jsonb;` and hope for the best (that string encodings will match) or use the old syntax with ->> or #>>.
> Is there even a function that would convert JSON encoded "string" to text it represents in postgresql? I didn't find it.
Oddly, no, there's no specific function for taking a root-level scalar JSON term (like '"foo"'::jsonb), and extracting said scalar to its equivalent native Postgres type.
You can still do it (for extracting to text, at least), but you have to use a 'vacuous' path-navigation to accomplish it, so it's extremely clumsy, and wastes the potential of the new syntax:
looks a bit meh. And custom right unary operators are on the way out, so one can't even create one for this use case.
Anyway, for fun:
create function deref_jsonb(jsonb) returns text as $$ begin return $1#>>'{}'; end $$ language plpgsql;
CREATE OPERATOR # ( leftarg = jsonb, function = deref_jsonb );
select '"sdfasdf"'::jsonb #;
select jsonb_col['a']# FROM somewhere;
Downvoters could instead provide a way to get a decoded text of a property with this new syntax, like it's possible with #>>. That would be more useful.
Wouldn't one big difference be that with this syntax, you can use bind-parameters / joined row-tuple fields / expression values as jsonpath keys?
ETA: no, actually, I was wrong — #>> takes text[], so you can already pass it an ARRAY[] literal containing expressions. It's just all the examples in the PG docs that use the IO syntax to represent the path, and then rely on an implicit cast to text[].
Would you be able to give a bit of context for the limitations of the new syntax that you’re pointing out? Could they be overcome (and if so why did it ship like this) or are they inevitable?
I don't think it's a limitation, it's just by design. a['b'] is equivalent to a->'b' not to a->>'b', otherwise deep references (a['b']['c']) would not work because first a['b'] would return text and not jsonb value.
Postgres is bowing to the inevitable, JSON support is too much in demand.
But this is going to be a classic example of bad design. Databases are a bad place to be storing JSON, which is a good interface and a bad storage standard. It is pretty easy to see how JSON will play out: some bright young coder will use JSON because it is easier, then over the course of 12 months discover the benefits of a constrained schema, and then have a table-in-a-table JSON column.
It isn't so out there to think that ongoing calls for JSON support will lead Postgres to re-implement tables in JSON. We've already got people trying to build indexes on fields inside a JSON field.
This is needless complexity engineered by people who insist on relearning schemas from scratch, badly, rather than trusting the database people who say "you need to be explicit about the schema, do data modelling up front".
I think PostgreSQL has always been very pragmatic. It's supported JSON natively since 9.2 (Sep 2012).
> Databases are a bad place to be storing JSON
You're right that "mature" features and projects have a very good understand of the schema. But not everything is that.
Suppose I want to collect info from the Github API about a bunch of repos. I can just store the entire JSON response in a table and then query it at my leisure.
There's also something to be said for contiguous access. Joining tons of little records together has performance problems. Composite types and arrays can also fill this void, but they both have their own usability quirks.
I use the json features of postgres to turn json into relation data. Store all json messages received in a table, then use a materialized view to extract the relevant parts into columns. Works well, and lets me keep the original data around.
> Databases are a bad place to be storing JSON, which is a good interface and a bad storage standard.
That's why in 99% of cases, Postgresql uses jsonb as storage standard, which is binary and compressed.
> This is needless complexity engineered by people who insist on relearning schemas from scratch
No, this is the right tool for situations where schemas are polymorphic, fluid, or even completely absent (like raw third-party data). I love SQL and following normal forms, and it is the right tool for most situations, but not all.
I've learned and applied my schema normalization and what have you got. But it's not the be-all and end-all of good engineering. What I greatly appreciate about hierarchical value storage in contrast to related flat records, is that it is so much easier to store and retrieve a tree. No need to generate ids and insert rows one by one, no need to decode the result of large joins. Because it doesn't only take time to write code for that, it can contain errors too.
If you've got hierarchical data and you just want to store, update and retrieve it as a whole (which is my use case), JSON is a good choice. Granted, it could be stored as a string/blob in my case. I don't really need to search within.
JSON in Postgres is a bit like a nail gun. Used correctly, it's incredibly useful. But in inexperienced hands (and lacking good technical leadership), it's easy to shoot yourself in the thigh.
You don't even need JSONB to commit war crimes on a Postgres database. There's many things that Postgres can do, but probably shouldn't be done:
- Storing "foreign keys" in an array column, instead of using a join table
- Storing binary files as base64 encoded strings in text columns
- Using a table with `key` and `value` string columns instead of using redis
- Pub/sub using NOTIFY/LISTEN
- Message queueing
- Other forms of IPC in general
- Storing executable code
- God tables
Even when trying to use Postgres appropriately, plenty of engineers don't get it right: unnecessary indices, missing indices, denormalised data, etc.
This isn't unique to Postgres, or relational databases in general. Any form of storage can and will be used to do things it's not designed or appropriate for. You can use as easily use S3 or Elasticsearch for message queuing, and can even find official guides to help you do so. Go back 20 years or so, and you can find implementations of message busses using SOAP over SMTP.
The problem isn't JSONB (or any other feature). It's bad engineering. Usually it's an incarnation of Maslow's Hammer: when all you have is a hammer, everything looks like a nail.
> Storing "foreign keys" in an array column, instead of using a join table
Very bad idea for a base table, sure. OTOH, potentially great idea for a (possibly materialized) view (as might eagerly storing an array of row values instead of keys.)
> Storing binary files as base64 encoded strings in text columns
That might not be a bad idea depending on size, and depending on how often you needed the binary vs. a base64 encoded string: if most of the use of the binary is in a context where it will be sent as base64 encoded, storing it that way might be a great idea.
> Using a table with `key` and `value` string columns instead of using redis
If you need an in-memory cache, sure. Otherwise...you could use redis, but I’m not sure why it would always be preferred.
> Pub/sub using NOTIFY/LISTEN - Message queueing
NOTIFY/LISTEN are a pub/sub mechanism. You shouldn’t use them alone as an application level message queueing system, but you definitely can build such a system on PG and might well use NOTIFY/LISTEN in the implementation.
> - Storing executable code
There are probably problems that involve specific instances of doing this, but this is at best to general to describe a thing yoi shouldn’t do.
Yeah re the binary base64 encoding - we found that when using text based queries (the default for pg gem on ruby and most other client libraries AFAIK), base64 in a text column outperformed byte columns as it avoids extra conversions to and from base64.
We did switch to using binary queries though (by passing result_format as 1), and at that point we did see a speedup from using bytea columns - but it’s quite a bit of work to get the type mapping correct then so probably not worth it in most cases
> Databases are a bad place to be storing JSON, which is a good interface and a bad storage standard.
JSON makes perfect sense in a database that already supports all of: BLOB, TEXT, XML, ARRAY, and composite datatypes, including any datatype, including those on this list, for members of ARRAY and composites.
OTOH, Postgres has had XML since v8.2 (2006) and JSON since 9.2 (2012), and “tables in <supported structured serialization format>” hasn’t happened yet, even as discussion item AFAIK, so perhaps it would be bad, but even so it seems to be just fantasizing something to worry about.
> This is needless complexity engineered by people who insist on relearning schemas from scratch, badly, rather than trusting the database people who say "you need to be explicit about the schema, do data modelling up front".
The reason is with some projects/data it's hard to be explicit about the schema which is why NoSQL had it's popularity phase.
Now most applications don't have either entirely structured or entirely unstructured data, they will have a mix - so it's absolutely brilliant for one tool to do both. If they didn't support JSON I have a strong suspicion that they wouldn't have had some of the growth we have seen for Postgres across the last few years.
JSONB capabilities in Postgres are amazing, but the syntax is really annoying - for example, I'm forever mixing up `->` and `->>`. This new syntax feels far more intuitive.