by amirkdv 1763 days ago
> I think YAML is even worse as a serialization format than a configuration format.

This. I find YAML to be the least offensive option for configuration and one of the worst for serialization.

I might be misinformed, but I find it absurd that in 2021 we still don't have a default, universally available tool that supports the basic table stakes without headache:

1. core data types (number, string, etc)

2. lists, maps, and arbitrary nesting

3. comments

4. multiline strings

5. acceptably readable

6. Just Works everywhere

YAML does get 1-5 right (specifically 3 and 4, which JSON lacks, and IMO it does better on 5). But then it adds a ton of complexity that has left us without a standard, safe, and sane parser implementation: anchors and references (& and *), casting (via !!), custom data types (via !), and loads of other things I don't understand.

9 comments

>loads of other things I don't understand.

Also, you can have multiple YAML documents in one file, each one separated by ---. I think this makes it very confusing when used as a configuration language (Kubernetes likes to use YAML for configuration).

I mean sure but I’ve never seen any software actually use that feature in the wild except for “you’re allowed to concat your YAML files instead of separating them if you want.” Like I’ve never seen software require a certain number of documents in a file with different schemas.
I saw it used in some CI system; it was very confusing.
It's a real shame JSON doesn't have 3 and 4, because it would be so easy and it's otherwise pretty much perfect imo. No ambiguity, every value type can be identified by its first character, clean and reasonably minimal syntax.
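To illustrate the first-character point, here is a rough Python sketch. `json_kind` is a made-up helper name, and it only inspects the first non-whitespace character, so it does not validate the rest of the input.

```python
def json_kind(text):
    """Guess the JSON value type from the first non-whitespace character."""
    s = text.lstrip(" \t\r\n")  # the four whitespace characters JSON allows
    if not s:
        raise ValueError("empty input")
    c = s[0]
    if c == "{":
        return "object"
    if c == "[":
        return "array"
    if c == '"':
        return "string"
    if c in "tf":
        return "boolean"  # true / false
    if c == "n":
        return "null"
    if c in "-0123456789":
        return "number"
    raise ValueError("not a JSON value: %r" % s[:10])
```

No lookahead is needed to pick the branch, which is a large part of why hand-written JSON parsers tend to stay small.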
JSON5 is nice. I use it for all our configuration files at work after evaluating a large list of configuration file formats. I've never really run into any frustration using it, whereas YAML, TOML, and others drive me crazy when I need to represent nested structures or arrays.

https://json5.org/

Well, there are JSON alternatives that fit this bill, such as HJSON.

https://hjson.github.io/

It might not be as "common", but it has good implementations for many languages.

You could allow Python-like comments and strip them with a regular expression substitution before parsing. Something like this (it strips every line that starts with optional whitespace, followed by #, followed by anything up to the end of the line):

   sed -e 's/^[[:space:]]*#.*$//' cfg.json | jq .
Allowing multiline strings is more complicated; you can't do that with regex alone.
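A line-based regex either misses comments that trail a value or, if left unanchored, also cuts any # that sits inside a string value (a URL fragment, say). A minimal Python sketch of a string-aware stripper follows. `strip_hash_comments` is a hypothetical helper, and it assumes # is the only comment syntax you want to allow.

```python
import json

def strip_hash_comments(text):
    """Remove '#' comments that appear outside JSON string literals."""
    out = []
    in_string = False
    escaped = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False      # this char was escaped, take it literally
            elif ch == "\\":
                escaped = True       # next char is escaped
            elif ch == '"':
                in_string = False    # closing quote
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif ch == "#":
            # Skip to the end of the line; the newline itself is kept.
            while i < len(text) and text[i] != "\n":
                i += 1
            continue
        else:
            out.append(ch)
        i += 1
    return "".join(out)

cfg = '''{
  # listen address
  "url": "http://example.com/#frag",  # trailing comment
  "port": 8080
}'''
data = json.loads(strip_hash_comments(cfg))
```

The result is still handed to a stock JSON parser, so everything except the comment handling stays standard.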
There are lots of ways to implement it. The problem is that one of JSON's strengths is its ubiquity: every language under the sun has half a dozen different battle-tested parsers for it. Clients and servers and everything in-between have first-class support out of the box. You can even paste it directly into JavaScript as valid code.

If anybody short of a standards body tries to expand the spec, you lose out on most of that.

I think you can define how you want to use JSON for a configuration file; JSON by itself is not much more than JavaScript objects/maps, defined as a data format. Frankly, I don't think you need to be too pious about standards compliance when dealing with a configuration format for your own application.
>json by itself is not much more than javascript objects/maps

And without comments, optionally quoted keys, single quotes, or trailing commas, which makes editing by hand more work and IMO less readable.

Editors, for one, will be an issue.
What about a key named “_comment”, or something similar? Of course, the underlying software must ignore unknown keys, so it’s not a full win anyway.
That might help for a general comment, but not for a comment on a specific part of the structure.
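For what it's worth, the "_comment" convention could look roughly like this in Python. `drop_comment_keys` and the sample config are invented for illustration. It assumes the application strips the key itself after parsing.

```python
import json

def drop_comment_keys(value, key="_comment"):
    """Recursively remove `key` from every object in parsed JSON."""
    if isinstance(value, dict):
        return {k: drop_comment_keys(v, key)
                for k, v in value.items() if k != key}
    if isinstance(value, list):
        return [drop_comment_keys(v, key) for v in value]
    return value  # strings, numbers, booleans, null pass through

raw = ('{"_comment": "dev settings", '
       '"db": {"_comment": "local only", "port": 5432}, '
       '"hosts": ["a", "b"]}')
config = drop_comment_keys(json.loads(raw))
```

This shows the limitation too: you get at most one such key per object, and there is no place to hang a comment on a single array element or scalar.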
If Dhall was more popular, would it meet your criteria?
HCL does all of those.

TOML is pretty close as well.

HCL is neither acceptably readable nor does it just work everywhere. TOML is getting there though yeah.
I used to think that before I had to edit the ejabberd.yml config file. Now I think it's only remotely useful as a subset to use in Jekyll headers.

Just use TOML for configuration instead.

> I used to think that before I had to edit the ejabberd.yml config file.

Hey, it's still better than having to write the configuration in Erlang like you had to before.

Besides the above (especially comments), YAML also has one huge advantage for configuration files and that is clean diffs.

JSON's lack of support for trailing commas messes up diffs.
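A small sketch of the diff point, using Python's difflib. The helper name and the sample lists are made up. Appending one element to a JSON array touches the previous line as well, because it needs a new comma, while the YAML equivalent only adds a line.

```python
import difflib

def changed_lines(before, after):
    """Return only the added/removed lines of a unified diff."""
    diff = list(difflib.unified_diff(
        before.splitlines(), after.splitlines(), lineterm="", n=0))
    # diff[0:2] are the '---' / '+++' file headers; skip them.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]

json_before = '[\n  "a",\n  "b"\n]'
json_after = '[\n  "a",\n  "b",\n  "c"\n]'
yaml_before = "- a\n- b"
yaml_after = "- a\n- b\n- c"

json_changes = changed_lines(json_before, json_after)
yaml_changes = changed_lines(yaml_before, yaml_after)
```

Here `json_changes` has three entries (one removal, two additions) and `yaml_changes` has one.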

If you come up with a format, call it IJW. It just works.
Use dhall and serialise to JSON/YAML!
My JSON parser ( github.com/nanoscopic/ujsonin ) has these things:

0. Looks essentially the same as JSON

1. Core data types, and customizable data types can be added easily.

2. Arrays, Objects, and arbitrary nesting.

3. Comments ( both /* */ and // format )

4. Multiline strings ( by default; carriage returns are no problem within strings )

5. It is JSON with relaxed restrictions and a slight addition for actual named types.

6. I've written C, Perl, and Golang implementations so far.

I like the idea of what you're doing. May I suggest a Nim port? Nim outputs C anyway, but is safe, so you'd worry less about bugs.