A little confused. The author is arguing _against_ JSON in URLs, right?
The piece starts as a list of complaints about hacky query parameter encoding, and has the feel of one of those "aren't these things annoying... but here's the right answer!" posts, except instead of a correct answer we get a discussion of how JSON is nice in some cases but we shouldn't use it in URLs, except maybe sometimes.
Is this a response to some post about how JSON _should_ be used in URLs? Overall, I struggled to find a coherent narrative / argument here. Perhaps something went wrong in the article's editing process.
Even this article's final conclusion is that this is a bad idea.
Data within the URL also has a number of other issues including:
- It is easier to leak (e.g. browser history, proxies, some browser extensions, etc). Very few things record HTTP POST parameters unless they're doing something evil; very many non-evil pieces of software record the full URL.
- Users intentionally or inadvertently re-posting data is much more likely. No browser's autocomplete re-posts HTTP POST parameters to the website, but many will do so with HTTP GET parameters, which could result in a worse user experience (or, in rare cases, the user performing actions on your website without intending to).
- Maximum length: URLs are subject to practical length limits in browsers, servers, and proxies.
- The encoding/decoding step could be extremely expensive for some data. JSON itself needs a lot of escaping, and depending on what you're moving, almost all of the data may need to be escaped, making the URL string insanely massive.
- Users will copy/paste these URLs to one another and the JSON will remain in all its 200+ character ugly glory. This may not be a security issue but it is a user perception issue. URLs are meant to be getting cleaner/more human readable.
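To give a rough sense of the size overhead in the encoding point above, here is a small Python sketch that percent-encodes a compact JSON payload for use in a query string. The payload and the parameter name `q` are made up for illustration and are not from the article.

```python
import json
from urllib.parse import quote, unquote

# Hypothetical payload, invented for this example.
payload = {"filters": [{"field": "year", "op": ">=", "value": 2011}], "sort": ["-rating"]}

# Compact JSON, then percent-encode every reserved character.
raw = json.dumps(payload, separators=(",", ":"))
encoded = quote(raw, safe="")


def overhead(raw_text: str, encoded_text: str) -> float:
    """Return how many times longer the encoded form is than the raw form."""
    return len(encoded_text) / len(raw_text)


print("?q=" + encoded)
print(len(raw), len(encoded), round(overhead(raw, encoded), 2))

# The server has to undo the escaping before it can even start parsing JSON.
decoded = json.loads(unquote(encoded))
```

Every brace, quote, colon, and comma turns into a three-character escape, so structure-heavy JSON grows much faster than plain `key=value` pairs would.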
For your two User points, both revolve around "Users" having access to these APIs, however the article argued that these should be deeper APIs. Not for human consumption, right? I feel like that was part of the (vague) point - Params give a URL human-usability. To go beyond that and to get more power out of a URL, they (vaguely) propose JSON as a non-human friendly format.
So for these deeper APIs, how many humans are accessing these?
And, most importantly, I don't believe the article was arguing against POST (in fact, it doesn't mention POST... once). I believe the argument was for times when you are already using URLs to convey data, such as in a search query, that JSON might be a useful alternative to plain HTTP params.
A JSON like syntax that is URL query compatible sounds like a great solution to me. Rison tried this approach a couple years back: https://github.com/Nanonid/rison
Rison (specifically O-Rison) is a great solution to this problem. We have used it in production for a couple of years now, and it has served us very well.
I designed something similar while writing scoutcamp (an express.js competitor). I made it so that it could parse both normal queries (eg, `?age=12&cards=visa&cards=mastercard`) and JSON (eg, `?age=12&cards=["visa","mastercard"]`).
I'd argue that design is easier to read and allows graceful degradation: someone using an API with fields `age` and `cards` can rely on the traditional way to work. They can easily reformulate their query when more complex structures are required. Yet Ajax scripts can use a simple shim to make XHR calls that send arbitrary JSON data.
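A minimal sketch of that dual parsing idea, written in Python rather than as the actual scoutcamp code (which is not shown here). The function name `parse_query` and the rule "only values starting with `[` or `{` are tried as JSON" are assumptions made for this example.

```python
import json
from urllib.parse import parse_qs


def parse_query(query: str) -> dict:
    """Parse a query string, accepting JSON arrays/objects as values.

    Every key maps to a list, whether it was sent as repeated
    parameters or as a single JSON array.
    """
    result = {}
    for key, values in parse_qs(query, keep_blank_values=True).items():
        decoded = []
        for value in values:
            parsed = value
            # Assumption: only values that look like a JSON array or
            # object are decoded; everything else stays a plain string.
            if value[:1] in ("[", "{"):
                try:
                    parsed = json.loads(value)
                except ValueError:
                    parsed = value  # malformed JSON degrades to a string
            if isinstance(parsed, list):
                decoded.extend(parsed)
            else:
                decoded.append(parsed)
        result[key] = decoded
    return result


traditional = parse_query("age=12&cards=visa&cards=mastercard")
as_json = parse_query('age=12&cards=["visa","mastercard"]')
```

Both calls produce the same structure, which is the graceful degradation described above. Note that `age` stays the string `"12"` in both forms, since bare scalars are deliberately not run through the JSON decoder.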
'I made it so that it could parse both normal queries (eg, `?age=12&cards=visa&cards=mastercard`) and JSON (eg, `?age=12&cards=["visa","mastercard"]`).'
It's worth pointing out, since even here in 2015 people often don't realize this, that the querystring is permitted by the standard to contain more than one value with the same name. Thus, your second example in this case could also be cards=visa&cards=mastercard.
This is particularly a problem when you've got an overly-helpful web framework in a dynamically-typed language that normally exposes querystring parameters as a string, but if more than one gets passed in, suddenly you have an array. This can cause all sorts of fun problems, including security failures, though usually just bugs.
(Even in a dynamically typed language, a web framework should either represent querystring parameters as an array at all times xor enforce a clear, stated policy of which string it accepts if it is going to pick just one, but it should definitely not just split the difference. The latter, despite in some sense being "wrong", does have the advantage of matching people's mental model, which is a useful enough thing on its own.)
This is, incidentally, a useful "Are the developers at least this tall?" test to apply to your new Web Framework o' the Week.
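The two acceptable policies described above can be sketched in a few lines of Python. The helper names `get_all` and `get_first` are hypothetical; the only library call used is `urllib.parse.parse_qs`, which always returns lists.

```python
from typing import List, Optional
from urllib.parse import parse_qs


def get_all(query: str, name: str) -> List[str]:
    """Policy 1: always a list, whether the name appears 0, 1, or many times."""
    return parse_qs(query, keep_blank_values=True).get(name, [])


def get_first(query: str, name: str, default: Optional[str] = None) -> Optional[str]:
    """Policy 2: always a single string; when a name repeats, the first one wins."""
    values = get_all(query, name)
    return values[0] if values else default
```

What the caller never sees is a value that is a string for `cards=visa` and silently becomes a list for `cards=visa&cards=mastercard`.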
Basically everything about the status quo is messed up.
JSON into parameters: bad.
JSON into request body: undefined with the GET method.
JSON into request body with POST: great, but we're not actually mutating anything, so it's bad practice and won't be cached by varnish/nginx + a barrage of other problems.
I really wish someone would take up something like REPORT from http://www.ietf.org/rfc/rfc3253 as an analogue to GET, just with a JSON request body. Hey, we as a community finally managed to get PUT and PATCH on track, why not this one?
PUT and PATCH on track? I'm going to have to look into that. I had no idea progress was being made there, and I thought PUT and DELETE were basically dead... at least, in hopes of browsers being able to use them.
I'm not sure what the reference to PUT is (it's a base HTTP method) but I assume that PATCH refers to getting it added as an RFC-specified extension to HTTP/1.1.
> and I thought PUT and DELETE were basically dead... at least, in hopes of browsers being able to use them.
Browsers can use PUT and DELETE.
HTML forms, however, support only the post and get methods. (I don't really get why: the semantics of PUT, PATCH, and DELETE as form methods seem pretty obvious given those of GET and POST.)
When I pass around any parameters in GET or POST parameters I wrap them in base64. That makes a lot of escaping bugs go away (and adds a bit of "security by obscurity", as well as true security when combining the query with a random number, a sha256 hash of the parameters and a serverside secret).
Sounds like some pretty crappy hand rolled security to me. Don't ever think of something as secure unless it actually is a secure protocol. You are wasting your time and your code is probably exploitable.
Not that this matters to your overall point, but base64 isn't actually a valid format to use in a parameter as a base64 string can legally contain: '+', '/' and '=' which would be interpreted and corrupt the data.
In the .Net world you'll want to use something like HttpServerUtility.UrlTokenEncode()/UrlTokenDecode() since it gives you a base64-like string with '+', '/' and '=' replaced or removed.
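For comparison outside .NET, here is a small Python sketch of the common URL-safe base64 variant (base64url from RFC 4648), which swaps `+` and `/` for `-` and `_` and drops the `=` padding. This is not claimed to be byte-for-byte what the .NET helper produces; the function names are made up for the example.

```python
import base64


def b64url_encode(data: bytes) -> str:
    """Encode bytes as URL-safe base64 with the '=' padding stripped."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(token: str) -> bytes:
    """Restore the stripped padding, then decode URL-safe base64."""
    padding = "=" * (-len(token) % 4)
    return base64.urlsafe_b64decode(token + padding)


# These three bytes encode to "+//+" in standard base64.
token = b64url_encode(b"\xfb\xff\xfe")
```

The resulting token contains only letters, digits, `-`, and `_`, so it can be dropped into a query parameter without further escaping.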
I used JSON in the URL to preserve the state of a report across reloads. I kind of regret it, because it wouldn't have been that much work to maintain a URL with normal query args. I think I had in mind that JSON would better handle things like the collapse/expand state of nested groups (a tree, essentially), but mostly I was just lazy, considering that I never got that far.
Agreed. I was thinking the same thing. A fast and small encoding _(whatever that encoding may be)_ would help immensely, and is far more sane than url encoded JSON, imo.
We actually do something similar to this to make our application state (within limits) be reflected by the url and support navigation — both good things. The urls are horrible though and we're considering simply storing the state in a service and giving it a serial number (but this has its own issues).
I was hoping the article was proposing a better way of encoding JSON as URLs, e.g. using the characters that are allowed in place of curly braces etc. and doing simple translation. E.g. {"foo":"bar","baz":[]} becomes foo=bar&baz=&&
Ahh but that is not the value that is sent to the webserver, that is just your web browser prettying up the display. If we open the inspector and see what was sent we have:
We have used this syntax for URL filters, and it works surprisingly well:
/v1/movies?year>=2011&artist!=cage
The idea: to add more meaning to the parameters in the query string, just add a suffix (>, <, !, ~, etc.) to the field name. Before using the query, just filter them out. To add an array of options, just add the same field twice or use a comma in the value:
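A rough Python sketch of how such suffixed field names might be parsed on the server. This is an illustration, not the commenter's code: the function name `parse_filters`, the tuple layout, and the choice to report the operator as the suffix plus `=` are all assumptions, and what each operator means is left to the application.

```python
from urllib.parse import parse_qsl

# Suffix characters taken from the description above.
SUFFIXES = "><!~"


def parse_filters(query: str) -> list:
    """Split a query string into (field, operator, values) tuples.

    "year>=2011" is read as field "year>" with value "2011"; the
    trailing suffix is then stripped off the field name.
    """
    filters = []
    for name, value in parse_qsl(query, keep_blank_values=True):
        operator = "="
        if name and name[-1] in SUFFIXES:
            name, operator = name[:-1], name[-1] + "="
        # A comma in the value is treated as a list of options.
        filters.append((name, operator, value.split(",")))
    return filters


filters = parse_filters("year>=2011&artist!=cage")
```

Because the suffix lives in the field name, an ordinary query-string parser does the splitting and the operator handling is only a few lines on top.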
I'm all for (ab)using specs whenever convenient, but this is just such a bad idea.
You will rarely actually use query args (ie ?x=y&foo=1) with a true REST API anyway. Use your resource path and avoid query args except for idempotent GET operations. If you have complex data, just put it in a request body as JSON. Don't put that crap in a query arg. That's not what it's for.
What about the benefits that browser / caller caching gives you for GET?
It really feels like they needed to get some content out for some reason, and just found some post in some internal Dropbox wiki and posted it. Maybe they posted a draft by accident? I don't know...
A simple solution that can often work is to store the JSON in a database, assign a number to it, and use that number in the URL instead of the JSON. One could even add a cryptographic checksum (e.g., hash with secret salt) to make it a bit safer.
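A minimal sketch of that approach in Python, with several stand-ins: an in-memory dict plays the role of the database, the secret is a placeholder, and an HMAC is used in place of a plain salted hash. The function names and the `id-tag` token format are assumptions made for the example.

```python
import hashlib
import hmac
import json

SECRET = b"replace-with-a-real-server-side-secret"  # placeholder value
_store = {}  # stand-in for a database table: id -> JSON text


def _tag(raw_id: str) -> str:
    """Short HMAC over the id (truncated to 16 hex chars for brevity)."""
    return hmac.new(SECRET, raw_id.encode(), hashlib.sha256).hexdigest()[:16]


def save_state(state: dict) -> str:
    """Store the JSON server-side and return a short token for the URL."""
    state_id = len(_store) + 1
    _store[state_id] = json.dumps(state)
    return f"{state_id}-{_tag(str(state_id))}"


def load_state(token: str) -> dict:
    """Verify the tag, then look the state up by its number."""
    raw_id, _, tag = token.partition("-")
    if not hmac.compare_digest(tag.encode(), _tag(raw_id).encode()):
        raise ValueError("token failed verification")
    return json.loads(_store[int(raw_id)])


token = save_state({"expanded": ["q1", "q3"], "sort": "name"})
```

The URL then carries something like `?state=<token>` instead of the JSON itself, and a guessed or edited number fails verification before the lookup happens.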