by nikeee 1704 days ago
Since strings don't need to be quoted, what happens during deserialization if you want the string "T"? Does this lead to the equivalent of the Norway-Problem of YAML [0]?

Is the space between the key and the type necessary? If not, how do you distinguish between objects and types?

Does the validation offer some form of unions or mutual exclusion?

[0]: https://hitchdev.com/strictyaml/why/implicit-typing-removed/

YAML and its "Arrays" are really broken. The problem I see with Internet Object is that it's also implying this kind of mechanism.

Every time I read about new formats, they seem to get either the 1-n relations or the n-n relations implemented well, but not both. I guess that's what's so hard about map/reduce...

Regarding YAML: somebody on HN mentioned his project DIXY a couple years ago, and it's much much _much_ easier to parse than YAML. [1] I'm using this over YAML pretty much everywhere now.

[1] https://github.com/kuyawa/Dixy

YAML has so many problems. Python 3.10 brought a new one to my attention when the core devs realized their arrays of versions contained 3.1 twice and no 3.10. Indeed, if you write unquoted ASCII, YAML gives you strings. Except if it can cast it to a number, that is.
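
To make the 3.10 case concrete, here is a minimal sketch of the kind of implicit typing described above. It is not a real YAML resolver: `resolve_scalar` and its two regexes are hypothetical stand-ins that only mimic the "cast to a number if possible" rule for unquoted scalars.

```python
import re

# Hypothetical, simplified stand-in for implicit typing of unquoted scalars.
# A real YAML resolver has many more rules; this only covers ints and floats.
_INT = re.compile(r"^[-+]?[0-9]+$")
_FLOAT = re.compile(r"^[-+]?[0-9]+\.[0-9]*$")


def resolve_scalar(text):
    """Return an int or float if the text looks numeric, else the string."""
    if _INT.match(text):
        return int(text)
    if _FLOAT.match(text):
        return float(text)
    return text


versions = ["3.8", "3.9", "3.10", "3.10.1"]
resolved = [resolve_scalar(v) for v in versions]

# "3.10" becomes the float 3.1, so it collides with a literal 3.1,
# while "3.10.1" stays a string because it does not look like a number.
print(resolved)
```

Quoting the value ("3.10") is the usual way to keep it a string, since only unquoted scalars go through this kind of guessing.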

TOML is better, but it still has more gotchas than necessary. So many that I find it easier to just edit a Python file.

I'm thinking of giving CUE a try. Any feedback?

Dixy looks easy, but "There is only one simple rule. In Dixy, everything is a dictionary [string:string]" isn't accurate or helpful.

It's also [string:dictionary] and [string:?] where ? means nil. White space matters, and tab is fixed at 4 spaces wide. When creating text from a dictionary it adds "# Dixy 1.0\n\n" which means loading and saving will change the file every time! Not sure what other issues there are, but I noticed this line:

    // TODO: if key is numeric, parse as Array
It does look simple though. It'd be nice if someone made strict rules and addressed the corner cases.
> YAML and its "Arrays" are really broken.

Agreed. YAML does have some use cases. I find it useful when I want to manually write lots of JSON data for test scripts. But the format, because it tries to be concise, ends up being hard to parse manually.

I don't consider YAML a good serialisation format.

The annoyance of YAML is the possibility of doing things in different ways.
I’ll admit that YAML has its quirks, but a good syntax highlighter can take care of that in my experience. What’s wrong with YAML’s arrays?
> What’s wrong with YAML’s arrays?

That there are multiple ways to define Arrays: "- item", "-\n\titem", "\titem" or "item, item" for starters. Parsing YAML into Arrays requires context of its surroundings.

Without the previous context, you cannot know what type of data you're parsing when you are at a "-" at the beginning of a line or a "," in the middle of a line.

This is just unnecessary parser complexity and human ambiguity in my opinion.

As a question to you in case you disagree: What happens when you write down an indented/nested "\t- name: John, Doe"? It's pretty much unpredictable without the previously parsed data structures or their history in YAML.

(I don't wanna start the discussion of "<<" and how it influences the parsing context of YAML data structures. I think the merge key also has no place in a data serialization format.)

It seems to be a typed CSV, so whether `T` is interpreted as a string or a boolean presumably depends on the schema. That sounds slightly better than YAML, though it can easily break when you allow heterogeneous types (say, string or boolean).
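
A rough sketch of that point, assuming a schema-driven decoder. Nothing here is Internet Object's actual API: `decode`, the type names, and the `F` token are made up for illustration, and only `T` comes from the thread.

```python
def decode(raw, allowed_types):
    """Decode an unquoted token by trying the schema's allowed types in order.

    Hypothetical decoder: "bool" accepts only the tokens T and F,
    "string" accepts anything.
    """
    for type_name in allowed_types:
        if type_name == "bool" and raw in ("T", "F"):
            return raw == "T"
        if type_name == "string":
            return raw
    raise ValueError(f"{raw!r} does not match any of {allowed_types}")


# With a single declared type, the schema settles the question.
print(decode("T", ["bool"]))
print(decode("T", ["string"]))

# With a union, the result depends on which type is tried first,
# so the string "T" cannot be expressed if bool wins.
print(decode("T", ["bool", "string"]))
print(decode("T", ["string", "bool"]))
```

So the schema removes the guessing for single-typed fields, but a string-or-boolean field brings the ambiguity back unless strings can be quoted.
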
T is quite dumb. The author should have at least used #t and #f from Scheme.
The so-called "Norway Problem" of YAML is really the No-Way Problem of YAML. /s