by tripple6 595 days ago
I love this.

> Having both begin and terminate arrays start with << is more consistent.

It hides context for humans. I am a human and I love to see what opens and what closes the context. Why would `<<` open an array when `[` is an astonishingly widespread practice? Why would `<<` also close it just because you think it is more consistent? What if open/close balance is also a form of consistency, especially for nested arrays?

Also, just think how many keystrokes you'd save by using `]` instead of [Shift]+`,` [Shift]+`,` [Shift]+`4` [Shift]+`.`, if you declare it as readable text.

> Using `{` and `}` would lead to more special characters.

Agree. Too many now.

> It is simpler to support graphs in the markup. The fact is that the data being serialized may be structured in a graph.

I can't understand why you call it native graph support. The only thing it does is declare an identified element and references to that element. I can't see how that differs from XML or JSON, which semantically "have graph support" just because they can also declare something treated as ids and references to the identified element.

> LaTeX and PostScript both use % for comments.

Yes, just learnt that from your comment and https://news.ycombinator.com/item?id=42047634 by zzo38computer. Thank you.

> # matches the usage in ᴄꜱꜱ and ʜᴛᴍʟ, relating to an id/page location.

No. The # symbol is overloaded. It may start a comment, especially in line-oriented, human-readable text formats and scripts. CSS uses it for IDs. HTML itself has nothing to do with it: browsers only use # as part of a URL to reference a particular identified element for navigation. That part is called the fragment (informally, the anchor) in URL syntax. Web browsers formerly used <a name="anchor"> to navigate to a part of the page. In the HTML5 world any `id` attribute is treated as an anchor, which I find a design flaw: ids exist to identify, so every id in the document ends up exposed for navigation purposes, whereas <a name="anchor"> is semantically something for navigation.

> Having a space after the # differentiate between an id and a comment would be a mistake.

Of course it would, from the current perspective, if the id declaration is `#`. I don't know what `#<NON_WHITESPACE_CHAR>` would do if it's legal.

> The Formats section is to facilitate interoperability between implementations, e.g. if you are encoding a ɢᴜɪᴅ [easy to say] then format it this way.

I agree that it may look better for consistency purposes, but what interoperability is all that about? Why would formatting even affect it? From the consumer application's point of view, a value must be handled according to its context, defined by its purpose and semantic type. If my element/attribute is formally declared as a GUID, then why would I care that much whether it's conventionally formatted? Would it still be a GUID if I encoded it using Base64? The dashes in GUIDs are for humans only and they are optional, and the application knows it's a GUID and can process it leniently. The same goes for ISBN/ISSN for books and magazines, card numbers, phone numbers, etc. -- none of them require dashes or spaces or parentheses to be processed.
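To make the point about optional dashes concrete, here is a minimal Python sketch using only the standard library (it is not part of any Xenon implementation, and the variable names are illustrative). It shows one GUID arriving in three textual forms and decoding to the same value:

```python
import base64
import uuid

dashed = "aa512e8e-cf97-445e-ac10-cb5a5ea3ef63"
compact = "aa512e8ecf97445eac10cb5a5ea3ef63"

# uuid.UUID accepts the hex form with or without dashes.
a = uuid.UUID(dashed)
b = uuid.UUID(compact)

# The same 16 bytes can also travel as Base64 text.
encoded = base64.b64encode(a.bytes).decode("ascii")
c = uuid.UUID(bytes=base64.b64decode(encoded))

# All three spellings describe the same GUID.
same_value = (a == b == c)
```

The application only needs to know that the field is a GUID; which spelling was used on the wire is a detail of the decoder.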

This is why "Real numbers *should be stored* with commas for readability." is just hilarious. Why should? May I use underscores or dots or spaces to group digits (seriously, why comma)? Can I group digits after the period? If I need integers, why are they also limited to 32 bits and 64 bits? How would I represent an arbitrary-precision integer or non-integer number (say, I want Pi to 197 digits after the 3)? If ∞ is allowed, but there is no mention of +Inf and -Inf, can 4.2957×10^24 be used instead of 4.2957e24? May I just have a simple `D+(\.D+)?` for everything I need for true interoperability?
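As a rough illustration of how little the grouping character matters to a consumer, here is a small Python sketch. `parse_number` and its `group_chars` parameter are hypothetical names made up for this example; the set of grouping characters is an application-level choice, not something taken from the Xenon document:

```python
from decimal import Decimal

def parse_number(text, group_chars=",_ "):
    """Parse a number while ignoring grouping characters chosen by the application."""
    cleaned = text
    for ch in group_chars:
        cleaned = cleaned.replace(ch, "")
    # Map the infinity symbol onto a spelling that Decimal understands.
    cleaned = cleaned.replace("∞", "Infinity")
    # Decimal keeps every digit it is given, so precision is not capped at 64 bits.
    return Decimal(cleaned)

grouped_with_commas = parse_number("1,234,567.25")
grouped_with_underscores = parse_number("1_234_567.25")
negative_infinity = parse_number("-∞")
```

This sketch assumes `.` is the decimal separator; a locale that uses `,` for decimals would need a different `group_chars` and an extra replacement step.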

I agree consistent formatting is really beautiful, but it must never be the key to processing data.

> It is more terse than ᴊꜱᴏɴ.

Sorry, it's not.

> Good

Could you please provide an example of a minified (single line, no newlines) array of timestamps from your page?

UPD: I've just seen https://news.ycombinator.com/item?id=42038508 by Oras. Well, you know.

----

In short, too many whys, weird syntax and design decisions, so I cannot see anything that makes it a "better alternative" to XML, JSON, or YAML.


I don’t love this.

> I can't see how that differs from XML or JSON, which semantically "have graph support" just because they can also declare something treated as ids and references to the identified element.

When serialising data with ᴊꜱᴏɴ one has to use special field names such as $id; hoping the programming language does not. It DOES have native graph support that xᴍʟ and ᴊꜱᴏɴ do not.

> # [..] it may be a comment start

No.

> but what interoperability is all that about?

Interoperability between implementations. If you were using Xᴇɴᴏɴ to communicate between two different languages, say a C# and a Python implementation, agreeing on what an integer IS is helpful. Both Xᴇɴᴏɴ libraries can provide support for encoding say ɢᴜɪᴅs. You have missed the point. A user is always free to encode data as arbitrary strings.

> commas [...] readability." is just hilarious. Why should?

Commas make numbers faster to interpret. Something `ls` is missing. As I stated on another branch, English is the global lingua franca, so commas every three digits is the standard.

> ∞ is allowed, but no mention on +Inf and -Inf, can be 4.2957×10^24

∞ is +Inf. 4.2957×10^24 is not the xᴇɴᴏɴ standard.

>> It is more terse than ᴊꜱᴏɴ.

>Sorry, it's not.

See https://news.ycombinator.com/item?id=42049033

<<Timestamps>2026-09-24T16\:45\:22.5383742<&>2026-10-04T18\:25\:12Z<&>2026-04-02<$>>

Better than ᴊꜱᴏɴ which does not do timestamps.

> When serialising data with ᴊꜱᴏɴ one has to use special field names such as $id; hoping the programming language does not.

Unless a serialization/deserialization tool supports property name overriding, which is trivial.
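As a hedged sketch of what property name overriding can look like, here is a Python example built on standard `dataclasses` and `json`. The `json_name` metadata key, the `Node` class and the `to_json` helper are all hypothetical names invented for this illustration, not the API of any particular serializer:

```python
import json
from dataclasses import dataclass, field, fields

@dataclass
class Node:
    # The wire name "$id" is not a valid Python identifier, so it is mapped here.
    node_id: str = field(metadata={"json_name": "$id"})
    label: str = ""

def to_json(obj):
    """Serialize a dataclass, using the overridden name where one is declared."""
    out = {f.metadata.get("json_name", f.name): getattr(obj, f.name) for f in fields(obj)}
    return json.dumps(out)

encoded = to_json(Node("n1", "root"))
```

The language-level field keeps an ordinary name while the serialized document carries `$id`, so the two naming rules never have to agree.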

> It DOES have native graph support that xᴍʟ and ᴊꜱᴏɴ do not.

Again, how is this different from `xml:id` that is referenced from other XML document nodes and what makes it "native graph support"?
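For reference, this is roughly what resolving `xml:id` references into a graph looks like with Python's standard `xml.etree.ElementTree`. The document, the `friend` element and its `ref` attribute are assumptions made up for the example; only `xml:id` itself is standard:

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the reserved XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

doc = """
<people>
  <person xml:id="alice" name="Alice"><friend ref="bob"/></person>
  <person xml:id="bob" name="Bob"><friend ref="alice"/></person>
</people>
"""

root = ET.fromstring(doc)

# First pass: index every identified element.
by_id = {el.get(XML_ID): el for el in root.iter() if el.get(XML_ID)}

# Second pass: follow references to build an adjacency list (cycles included).
graph = {pid: [f.get("ref") for f in el.findall("friend")] for pid, el in by_id.items()}
```

The two passes (index the ids, then resolve the refs) are the same work any id/ref scheme needs, whatever the surrounding syntax.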

> Both Xᴇɴᴏɴ libraries can provide support for encoding say ɢᴜɪᴅs.

> Better than ᴊꜱᴏɴ which does not do timestamps.

Better?

There is just no need. For what? These two can be controlled by optional schemas that may be extensible, like the types validated in XML Schema or RELAX NG. Schemas do not dictate format, and you don't need your format to be a schema. I still can't get what makes timestamps (and GUIDs) so special so that they have special sections in your document.

I tend to think JSON also has a design flaw in providing first-class support for booleans and numbers as literals, which it took from JavaScript, because the latter needs more complex syntax as a programming language. Ridiculously, XML seems to be perfect in this case, unifying scalar values: whatever scalar it encodes, the text representation can encode it in any efficient format regardless of whether it is a boolean, a number (integer, "real", complex, whatever special), a "human-text" string, a timestamp or whatever else; HTML attribute values, unlike XML ones, don't even need to be quoted in some trivial cases and may even be omitted for boolean attributes. The application simply parses/decodes its data and manages how the data is deserialized. That's all it needs.

I would probably be happy if, say, there were a format as simple/minimalistic as possible, not even requiring delimiters or quoted strings unless the content is ambiguous. Say, `[foo 'bar baz' foo\ bar Zm9vYmFyCg== 2.415e10 ∞ +Inf -∞ -Infinity \[qux\] +1\ 123\ 456789 978012345678 {k1 v1 k2 v2} aa512e8ecf97445eac10cb5a5ea3ef63 c8a0ebbd 2026-09-24T16:45:22.5383742 P3Y6M4DT12H30M5S]` or similar, maybe with node metadata and comments support. The above dumb format covers arrays/lists/sets, strings in plain form (`foo`, the space-containing `bar baz` and `foo bar`) and in Base64 encoding, the `2.415e10` number from your document and all four infinity notations, a single string `[qux]` (not a nested array with a single element), a phone number (with space-delimited country code, region code and local number), an ISBN, a simple map/object made of two pairs, a GUID, a CRC32 checksum, an ISO-8601 date/time, and an ISO-8601 duration. What more scalar types can it be extended with? Since there is no type for scalars, this "format" does not dictate types or preferred scalar formats, letting the application make the decision on how to interpret these on its own.
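A format like that is small enough to parse in a few dozen lines. The following Python sketch is one possible reading of the example, under assumptions that are mine: whitespace separates items, `'...'` quotes a string, a backslash makes the next character literal, `[...]` is a list, and `{...}` is a map of alternating keys and values. Every scalar comes back as a plain string, and there is only minimal error handling:

```python
def parse(text):
    """Parse the toy bracket format into nested lists, dicts and strings."""
    pos = 0

    def skip_ws():
        nonlocal pos
        while pos < len(text) and text[pos].isspace():
            pos += 1

    def scalar():
        nonlocal pos
        out = []
        if text[pos] == "'":
            pos += 1
            while text[pos] != "'":
                out.append(text[pos])
                pos += 1
            pos += 1  # skip the closing quote
            return "".join(out)
        while pos < len(text) and not text[pos].isspace() and text[pos] not in "[]{}":
            if text[pos] == "\\":
                pos += 1  # the escaped character is taken literally
            out.append(text[pos])
            pos += 1
        if not out:
            raise ValueError(f"unexpected character at offset {pos}")
        return "".join(out)

    def items(closer):
        nonlocal pos
        result = []
        skip_ws()
        while text[pos] != closer:
            result.append(value())
            skip_ws()
        pos += 1  # skip the closing bracket or brace
        return result

    def value():
        nonlocal pos
        skip_ws()
        if text[pos] == "[":
            pos += 1
            return items("]")
        if text[pos] == "{":
            pos += 1
            flat = items("}")
            return dict(zip(flat[0::2], flat[1::2]))
        return scalar()

    return value()

sample = parse(r"[foo 'bar baz' foo\ bar \[qux\] {k1 v1}]")
```

Deciding whether `2.415e10` is a number or `c8a0ebbd` is a checksum is deliberately left to the caller, which is the whole point of the proposal.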

> Commas make numbers faster to interpret. Something `ls` is missing. As I stated on another branch, English is the global lingua franca, so commas every three digits is the standard.

For whom? Humans? Why would data encoding obey region number|date/time notation standards at all? English, but US, UK, Canada, or any other English-speaking country? You've been told that in that thread too, especially since spaces or underscores are even more readable in monospace fonts. You don't need it.

> See https://news.ycombinator.com/item?id=42049033

Funny enough -- your format saves on key/value pair syntax, appealing to 4 vs 6 characters of overhead (okay, cool), but your array element delimiter `<&>`, which is also amazingly bad for keyboard typing ergonomics, loses to the simple and regular JSON `,` syntax (3 vs 1 overhead). Isn't it blind or crazy?
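The overhead can be measured rather than argued. This Python sketch compares the timestamp array posted earlier in the thread with a minified JSON equivalent. The Xenon string is copied as posted (with its escaped colons); the JSON shape, an object with a single `Timestamps` key, is my assumption about what a fair counterpart looks like:

```python
import json

stamps = ["2026-09-24T16:45:22.5383742", "2026-10-04T18:25:12Z", "2026-04-02"]

# Minified JSON: no spaces after separators.
as_json = json.dumps({"Timestamps": stamps}, separators=(",", ":"))

# The Xenon form as posted in the thread; colons are backslash-escaped there.
as_xenon = r"<<Timestamps>2026-09-24T16\:45\:22.5383742<&>2026-10-04T18\:25\:12Z<&>2026-04-02<$>>"

# Overhead = everything that is neither the key name nor the raw values.
payload = len("Timestamps") + sum(len(s) for s in stamps)
json_overhead = len(as_json) - payload
xenon_overhead = len(as_xenon) - payload
```

On this particular input the JSON form is 82 characters and the Xenon form is 84, mostly because of the escaped colons. Other data will shift the balance, so this is one data point and not a general result.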

> Again, how is this different from `xml:id`

It is a tidier solution.

> I still can't get what makes timestamps (and GUIDs) so special so that they have special sections in your document.

They are common in data.

> [...] boolean attributes

Separate attributes and sub elements is a mistake. One should be able to guess an ᴀᴘɪ.

> What more scalar types can it be extended with?

> letting the application make decision how to interpret these on its own.

That is laborious! A Xᴇɴᴏɴ library provides AsGuid, AsDateTime etc. and serialization directly to/from those types.

>For whom? Humans?

Yes. Humans have to read markup.

> Why would data encoding obey region number|date/time notation standards at all? English, but US, UK, Canada, or any other English-speaking country?

I repeat! READABILITY.

> Isn't it blind or crazy?

No, quite the opposite.

> It is a tidier solution.

Based on special syntax. You're about to introduce node attributes.

> They are common in data.

I use tables every day. May I have "first-class graph support" but for tabular data, which is very common as well? I have asked three or four times, expecting you to eventually explain what makes the graph support and how it differs from declaring ids and refs in other formats you think are worse than yours. No answer.

> Separate attributes and sub elements is a mistake. One should be able to guess an ᴀᴘɪ.

For the first, I kind of agree that attributes and subnodes should be unified in favor of subnodes (which was sacrificed in markups like HTML for the sake of sane brevity). However, attributes, which is what your ids are, may be metadata for nodes of any kind. For the second, API for what? Document generating/parsing API? Validation API? Serialization/deserialization API? Enveloped application API? I guess the latter, for whatever reason your "standard" dictates. In any case, documentation, schemas, data validators and autocompletion are my best friends; no need to "guess".

> That is laborious! A Xᴇɴᴏɴ library provides AsGuid, AsDateTime etc. and serialization directly to/from those types.

What you're mentioning is called serialization and deserialization, and these two can easily be implemented once for "basic" types and extended at the application level for any kind of data, because an application decides what to do with data on its own, not the format the data is enveloped in. Serialization and deserialization don't exist from the perspective of the format, which only defines the syntax by which data is marked up in a document. So why would it care about the formatting at all?
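A minimal Python sketch of that division of labour: the format layer hands over strings, and the application owns a table of decoders. The `DECODERS` registry, the field names and the `decode` helper are hypothetical, chosen only for this example:

```python
import uuid
from datetime import datetime
from decimal import Decimal

# Hypothetical application-owned registry: field name -> decoder.
DECODERS = {
    "id": uuid.UUID,
    "created": datetime.fromisoformat,
    "height": Decimal,
}

def decode(record):
    """Turn a dict of raw strings into typed values; unknown fields stay strings."""
    return {key: DECODERS.get(key, str)(raw) for key, raw in record.items()}

raw = {
    "id": "aa512e8ecf97445eac10cb5a5ea3ef63",
    "created": "2026-09-24T16:45:22",
    "height": "1.67",
    "note": "n/a",
}
typed = decode(raw)
```

Adding a new scalar type means adding one entry to the registry; the markup syntax does not need to change.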

> Yes. Humans have to read markup.

Format should not care too much.

> I repeat! READABILITY.

No yelling, please. Regional formats are defined by countries, not by languages as you said elsewhere, simply by definition, even if English is the lingua franca. Separate digits with underscores or spaces.

I'm very happy your "standard" neither recommends color highlighting for, say, numbers, nor, even worse, has special syntax for readability highlighting. Highlighting increases readability greatly as well, you know.

> No, quite the opposite.

6:4 but 1:3 is a great syntax win. Okay.

The absence of any solid counter-arguments from your side, and your blindness to the obvious design flaws of your so-called format "standard", only shows how you have mixed all the concepts into a mess: crazy markup syntax on one hand, and formatting of scalars on the other, which must only be handled by applications during serialization and deserialization, regardless of what the markup format "standard" recommends.

Good luck with your "standard", rightly criticized and rejected by others, but you'd better just bury it rather than spend your life on nothing. Sincerely.

> You're about to introduce node attributes.

Yes, but limited to #id and :type.

> tabular data that is very common as well?

Xᴇɴᴏɴ has first class arrays also so tabular data could be stored as such.

> explain what makes the graph support and how it differs from declaring ids and refs in other formats you think are worse than yours. No answer.

It is built in!

> So why would it care about the formatting at all?

FOR INTEROPERABILITY! That is, different implementations of xᴇɴᴏɴ agree on what a ɢᴜɪᴅ or date looks like! Fʏɪ, with a good implementation of xᴇɴᴏɴ you just point the library at your data, sometimes augmented with some attributes, and you get cleanly formatted markup.

>>One should be able to guess an ᴀᴘɪ.

> For the second, API for what?

Say you are using an ᴀᴘɪ for information about a person and there is information about their height. In xᴇɴᴏɴ one knows there shall be a scalar called “Height”; in xᴍʟ it may be an attribute or a sub element.

>> Yes. Humans have to read markup.

> Format should not care too much.

We are using text formats because they are READable to humans.

> Separate digits with underscores or spaces.

That is not standard anywhere.

> [...] color highlighting

Only the application knows if a scalar is a number or a string.

There are no obvious design flaws. Take xᴍʟ, add an array type and xᴇɴᴏɴ results.

We must be talking at cross purposes re formatting. [phew...] An application has an object called Person, and a field called Height with a type of double. In C♯:

    Person fred = new Person { Height = 1.67 };
    string xenon = XenonStart.Serialize("person", fred);

results in the string "<person><Height=1.67><$>". A xᴇɴᴏɴ implementation in another language, say JavaScript, can take that xᴇɴᴏɴ string and decode it into an object with a field called Height with a value that can be decoded .AsNumber into 1.67, because there is a standard for encoding an ɪᴇᴇᴇ 64-bit number/.net double/JavaScript number.
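To show the receiving side of that exchange, here is a toy Python decoder for exactly the string above. It is not a Xenon library and is based only on the single example in this comment: it handles one element with flat `<Name=value>` scalars, with no escapes, arrays or nesting, and `decode_flat_element` is a made-up name:

```python
import re

def decode_flat_element(markup):
    """Decode '<name><Field=value>...<$>' into (name, fields). Flat scalars only."""
    match = re.fullmatch(r"<(\w+)>((?:<\w+=[^<>]*>)*)<\$>", markup)
    if match is None:
        raise ValueError("unsupported markup")
    name, body = match.groups()
    # Every scalar arrives as text; the caller decides how to interpret it.
    fields = dict(re.findall(r"<(\w+)=([^<>]*)>", body))
    return name, fields

name, fields = decode_flat_element("<person><Height=1.67><$>")

# The text "1.67" becomes a 64-bit float only at this step.
height = float(fields["Height"])
```

In this sketch the conversion to a number is a separate, explicit step applied to the decoded text.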

Xᴇɴᴏɴ has more benefits.

> * Native support for arrays. I mentioned a few above. `<<Faults$$>>` and `<<$$>>` -- guess what these two mean if you see this first time? You would never guess. It's an empty array and an empty element, you've just failed.

<< means it relates to starting an array, $>> means it is the end, $$ meaning something else — an empty array!

The xᴍʟ alternative is a bodge:

    public class PurchaseOrder
    {
        public Item[] ItemsOrders;
    }

    public class Item
    {
        public string ItemID;
        public decimal ItemPrice;
    }
serializes to:

    <PurchaseOrder>
        <ItemsOrders>
            <Item>
                <ItemID>aaa111</ItemID>
                <ItemPrice>34.22</ItemPrice>
            </Item>
            <Item>
                <ItemID>bbb222</ItemID>
                <ItemPrice>2.89</ItemPrice>
            </Item> 
        </ItemsOrders>
    </PurchaseOrder>
Where the array is marked up as two sub elements, both called <Item>.

Xᴇɴᴏɴ has first class support for arrays:

    <PurchaseOrder>
        <<ItemsOrders>
            <ItemID=aaa111>
            <ItemPrice=34.22>
        <&>
            <ItemID=bbb222>
            <ItemPrice=2.89>
        <$>>
    <$>

The elements may be scalars so

    <PurchaseOrder>
        <<ItemsOrders>
        <$>>
    <$>

has an array with one item of the empty string. So a separate syntax for empty arrays is required!

    <PurchaseOrder>
        <<ItemsOrders$$>>
    <$>
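
For comparison only, and not as a statement about Xenon's grammar: in a format where strings are always quoted, an empty array and an array holding one empty string are already distinct without extra syntax. A small Python sketch with the standard `json` module:

```python
import json

# An array with no items.
empty = json.loads("[]")

# An array with exactly one item, the empty string.
one_empty_string = json.loads('[""]')

# The quotes carry the distinction that unquoted scalars cannot.
distinct = (empty != one_empty_string)
```

The ambiguity described above comes from scalars being unquoted, which is why a dedicated empty-array marker becomes necessary there.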