Hacker News new | ask | show | jobs
by o11c 877 days ago
The most interesting part here is the constructed polyglot:

  with ambiguous(j) as
      (select '[' || char(9) || '721]')
  select
          json_valid(j, 0x1) as "RFC 8259",
          json_valid(cast(j as blob), 0x8) as "JSONB"
      from ambiguous;
Also, this is an example of why dynamic typing is never safe, whether in a normal programming language or in your sql database. It turns out that the sqlite documentation has been lying for years and people assumed everything was fine.
2 comments

That got me nerd-sniped. The following grammar describes all possible JSON-JSONB polyglots:

    d = '0' | ... | '9'
    e = 'E' | 'e'
    
    int-frag = d d d
    canon-float-frag = int-frag | d e d | d '.' d
    float-frag = canon-float-frag | '.' d d | d d '.'
    float-suffix-frag = float-frag | e '+' d | e '-' d | e d d | '.' e d
    
    string-type = '7' | '8' | '9'
    
    polyglot-json =
      '3' int-frag |
      '5' canon-float-frag |
      '6' float-frag |
      string-type float-suffix-frag |
      '[' '\x09' string-type float-suffix-frag ']'
There are some additional possibilities like `40e3` which are rejected by the validation code but otherwise accepted. (`4` encodes a three-byte-long hexadecimal literal.)
> It turns out that the sqlite documentation has been lying for years and people assumed everything was fine.

More like the SQLite docs made an assumption and people, maintainers and users, didn't realize it until later.

Why the sudden hostility?

Because when you explicitly document "if you do this, it will fail", and people's code doesn't fail, they assume they must not have been doing that.

Especially remember that most people don't care about opening files in text mode vs binary mode. This is only more true in a UTF-8-only world.