| Sorry, but I would never use this format for either a manual or a programmatic approach.

* I've tried to read the data this format describes without reading its documentation and I just failed: the format is amazingly counter-intuitive. I never had readability or understanding issues with XML/HTML, JSON or even YAML (which I think is overly complicated) when I saw them for the first time.
* Terse does not mean cryptic. The basic notation is just weird: why would it need an unbalanced less-than symbol to open the array? Why `<&>` for delimiting elements? Why `<<$>` but not `<$>>`, at least to be more readable by a human and look balanced? The syntax gets weirder for arrays containing objects: indents (okay to some extent), `<>` and `<&>` (`{` and `}`?).
* Auto-removing whitespaces may hurt. If the format offers this, would it also offer heredoc-style text like `cat <<EOF` in Bash so that the formatting could be preserved as is? `xml:space` and JSON string literals were designed exactly for this. (upd: I just saw a new symbol: `|`... Well, okay, but that's another special character now.)
* Native support for arrays. I mentioned a few issues above. `<<Faults$$>` and `<<$$>` -- guess what these two mean if you see them for the first time? You would never guess. It's an empty array and an empty element, you've just failed.
* Graphs... Another weird syntax comes into the room: `#id;` but `@id` (no semicolon?). Okay, these seem to be first-class ids and refs, not necessarily designed for graphs (I'm not sure if the `#ID;` and `@` would play perfectly with any non-empty names.) But what does graphs make first-class citizens here and why? Graphs can be expressed, I believe, in any data/markup format/language and then processed by a particular application if graphs are needed. By the way, arrays and objects are not necessarily trees from the semantic point of view. More graph processing issues were mentioned in other comments on this topic. What about first-class support for sets? I'm kidding.
* Comments. Another symbol to come here: `%`. To be honest, I can't recall any instance I could see the percent sign elsewhere for this purpose. What if the comments would start with a well-known `#` at least with a space right after it so that it wouldn't be considered a "graph id" (or, don't get me wrong, with another `<`/`$` sequence)?
* Just got to the Escaping section and now I see how the characters are escaped. Perhaps this is okay.
* Scalars. Crazy number formatting and locale issues are waiting. The never-on-keyboard infinity symbol would be great for APL, but why not just Inf(inity)? Whatever the scalar value is, there is no need to cover all existing primitive scalars -- just let them be processed by an application, since all scalars are text semantically. More crazy things: what makes UUIDs so special for this format? What makes Base64 so special that it has native support (would it support Base16 for human-readable message digests, or Base58 to remove the visually lookalike Base64 characters)?
* CR/LF? I can understand its semantic purpose, but why not LF to make it even more "blazingly" fast? Say good-bye to UNIX users.
* The cognitive load for the markup syntax absolutely does not make it efficient in typing. Believe me, it does not.

What I would do, I would probably enhance the widely used formats, say make JSON, which I find almost perfect from the syntax point of view, not require quotes for object property names if the names do not contain special characters like `:`, just like it goes in JavaScript. And perhaps make XML "v2" move away from SGML hence loosening its syntax to get rid of the closing tags with shorter notation, first-class array support and fixing syntax issues, especially for CDATA and comments that can't support `--`. You would blame me, but I love XML the most: it just has the richest set of standardized, amazing, well-designed extensions to operate on XML, regardless of the heavy XML syntax.

P.S. How does it look like in the document it marks up is minified (e.g., no whitespaces)? |
Having both the begin and the terminate array markers start with `<<` is more consistent.
> `<>` and `<&>` (`{` and `}`?).
Using `{` and `}` would lead to more special characters.
> Auto-removing whitespaces may hurt.
It does not.
> Graphs... But what does graphs make first-class citizens here and why?
It is simpler to support graphs in the markup. The fact is that the data being serialized may be structured in a graph.
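To make the trade-off concrete, here is a rough Python sketch of what happens when a format has no reference syntax: the cyclic structure cannot be written out directly, so the application has to invent its own convention and resolve it after parsing. The `id`/`ref` keys and the `encode_graph`/`decode_graph` helpers are made up for this illustration; they are not part of JSON or of the format under discussion.

```python
import json

# Two nodes that point at each other: a graph with a cycle, not a tree.
alice = {"name": "Alice", "friends": []}
bob = {"name": "Bob", "friends": [alice]}
alice["friends"].append(bob)

# Plain JSON has no notion of references, so the cycle cannot be written directly.
try:
    json.dumps(alice)
    direct_ok = True
except ValueError:  # json.dumps detects the circular reference
    direct_ok = False


def encode_graph(nodes):
    """Flatten nodes into records, replacing object links with {"ref": index}.

    The "id"/"ref" keys are a hypothetical application-level convention.
    """
    index = {id(node): i for i, node in enumerate(nodes)}
    return [
        {
            "id": i,
            "name": node["name"],
            "friends": [{"ref": index[id(friend)]} for friend in node["friends"]],
        }
        for i, node in enumerate(nodes)
    ]


def decode_graph(records):
    """Rebuild the linked objects from the flattened records."""
    nodes = [{"name": record["name"], "friends": []} for record in records]
    for record, node in zip(records, nodes):
        node["friends"] = [nodes[link["ref"]] for link in record["friends"]]
    return nodes


text = json.dumps(encode_graph([alice, bob]))
alice2, bob2 = decode_graph(json.loads(text))
```

Both approaches work; the difference is only whether the id/ref resolution lives in the parser or in every application that needs it.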
> CR/LF
It supports LF-only Unix line endings as well as CR/LF internet line endings.
> Comments [...] To be honest, I can't recall any instance I could see the percent sign elsewhere for this purpose
LaTeX and PostScript both use `%` for comments. `#` matches its usage in CSS and HTML, where it relates to an id/page location.
> What if the comments would start with a well-known `#` at least with a space right after it so that it wouldn't be considered a "graph id"
Having a space after the `#` to differentiate between an id and a comment would be a mistake.
> Scalars. Crazy [...] UUIDs
The Formats section is there to facilitate interoperability between implementations, e.g. if you are encoding a GUID [easy to say] then format it this way.
> not make it efficient in typing.
It is more terse than JSON.
> XML "v2" ... first-class array support
Xenon has first-class array support; the XML-like syntax leads to the `<empty-array$$>` notation.
> P.S. How does it look like in the document it marks up is minified (e.g., no whitespaces)?
Good.