Hacker News
by JimmyM 3550 days ago
What's wrong with writing JSON by hand? I've never touched YAML before, but JSON seems pretty clear to me - doesn't feel much different to writing a list naturally.

Is YAML one of those things that proper professional programmers need but us amateurs can botch our way around?

7 comments

The single fact that JSON doesn't support comments makes it a pretty bad choice for configuration files, where you often want to document why a specific value was chosen for a setting.

Yes - I know that some JSON parsers will allow comments and strip them, but IMHO you shouldn't rely on this, and lots of editors will complain if they encounter any non-standard JSON.
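
To make the point about strict parsers concrete, here is a minimal sketch using Python's standard `json` module, which follows the standard and rejects comments. The config text and the `parses_as_json` helper are made up for illustration; other parsers may behave differently.

```python
import json

# Hypothetical config text with a //-style comment, which standard JSON does not allow.
CONFIG_WITH_COMMENT = """{
    // retry budget raised after an outage
    "retries": 5
}"""


def parses_as_json(text: str) -> bool:
    """Return True if the stdlib parser accepts the text, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(parses_as_json(CONFIG_WITH_COMMENT))  # the commented version is rejected
print(parses_as_json('{"retries": 5}'))     # the same data without the comment parses
```

So the "why" either has to live somewhere else or the file stops being standard JSON.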

this is my single largest complaint about JSON... I NEED comments.
I think the idea is that the format lends itself towards being human readable.

If the names/values are chosen well their purpose will be self evident.

Names/values can tell you the what. Comments can tell you the why.
These aren't dip-switches. Description of the logic behind the configuration settings doesn't need to be embedded into the config file. It can be in the manual.
I think you're still missing the point. It's not the description of the logic behind them.

Comments are useful for conveying why a particular value was chosen in this particular config file by some person. For example:

    # This is temporarily disabled until TICKET-432 is fixed.
    # It should then be turned back on.
    feature-that-should-usually-be-enabled: false
Agreed this is helpful when reverse engineering something you don't have documentation for.

But, this is still information that shouldn't be embedded in the config file.

TICKET-432 should say "feature-that-should-usually-be-enabled is set to false while this issue is active. When this is fixed, set it back to true."

> If the names/values are chosen well their purpose will be self evident.

I don't see how. If I pick a particular value for a config setting, it's obvious what value was chosen, but there's nothing to suggest why that value was chosen.

This reminds me of a terrible programmer I knew who, when asked why he had no comments in his code, said: "inspection of the code should be sufficient".

Not for your code it wasn't.

It takes many years for people to figure out what the right level of commenting is. It's more of an art than a science. Worse, the right level of commenting depends on the reader. An old veteran may find distracting a comment that a beginner finds extremely helpful. But comments can also be a liability if they aren't maintained with the code or if they make statements about other code that stop being true after a while.

A comment like:

# Add 1 to the length of this buffer to work around an off by 1 error in this function in library foo

Can quickly go stale, but sometimes it doesn't, and without it the workaround could be accidentally reverted by someone who notices that the buffer is 1 element too long for no apparent reason.

A better comment:

# Add 1 to the length of this buffer to work around an off by 1 error in function foo from library bar (version 1.7.3b circa Nov 1997)

The YAML spec is huge, too large IMO for config files (you can use it while knowing only a subset, but you'll be lost as soon as someone uses a feature you don't know).

I think TOML strikes a good balance between simplicity and features for config files. It ends up being easy to read and write.

https://github.com/toml-lang/toml

I wrote a layer around an existing parser that refuses to parse anything except that subset that most people use 99% of the time:

https://github.com/crdoconnor/strictyaml

IMO TOML is syntactically messy, especially when dealing with hierarchical data, and a whole new config format to deal with the fact that YAML has too many features is somewhat unnecessary.

I agree. You often want to parse and/or generate configuration files programmatically. For these cases it's good if parsing and interpretation of the file format can be easily implemented (or is already implemented in high quality). YAML has a quite big featureset and definitely doesn't fall into the "easy to parse" category. I'm also quite happy with using TOML for configuration files for these reasons.
Nod.

TOML is really nice for configuration files.

JSON for configuration is a cute idea that doesn't scale at all. I've personally experienced this in a project at my job. Once the config gets substantially large, it becomes a headache. The real pain is when a sysadmin without any experience in JSON screws up the formatting or tries to change the config and can't decipher the parse errors.

JSON for config is a bad, bad idea - especially if your config will get large.
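
For what it's worth, part of the "can't decipher the parse errors" pain can be softened on the application side. A rough sketch, assuming the app loads its config with Python's standard `json` module; `describe_json_error` and the sample config are hypothetical names made up for this example:

```python
import json


def describe_json_error(text: str) -> str:
    """Return 'ok', or a message showing the offending line with a caret under the column."""
    try:
        json.loads(text)
        return "ok"
    except json.JSONDecodeError as err:
        lines = text.splitlines()
        # lineno and colno are 1-based; guard against an error reported past the last line.
        bad_line = lines[err.lineno - 1] if err.lineno <= len(lines) else ""
        pointer = " " * (err.colno - 1) + "^"
        return f"line {err.lineno}, column {err.colno}: {err.msg}\n{bad_line}\n{pointer}"


# Hypothetical config with a missing comma after the first entry.
BROKEN = '{\n  "host": "example.org"\n  "port": 8080\n}'
print(describe_json_error(BROKEN))
```

That at least points the sysadmin at a place in the file, though the position reported is where the parser gave up, which is not always where the mistake was made.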

No comments, too tedious to write all the delimiters (mainly string quotes, but lists and hashes add to that), and a trailing comma is disallowed, so moving list or hash elements around needs more attention.

> Is YAML one of those things that proper professional programmers need but us amateurs can botch our way around?

I really don't understand this attitude. If you already call `json.loads(some_string)', it's no harder to switch to `yaml.safe_load(some_string)' and go with that. The YAML data model is similar to JSON's.

JSON is a format mainly for machines, and is somewhat readable for humans. YAML is the reverse: it's mainly for humans, but can be processed by machines.

Tons of useless {} and quotes, useless in a configuration file, I mean. Just rewrite an nginx configuration file in JSON and you'll notice the difference, even if nginx has its share of {} and ; characters. Do the same exercise on a docker compose yml file.

Actually YAML should be easier for amateurs than for professionals. Pros have tools that deal properly with XML, and JSON is a lesser problem.

[{[[{[{[{[{}]}}}]}]]}]

Spot the typo!

If you think this is stupid because surely everyone uses an editor with automatic scope highlighting etc, I don't think that is the case when you are remotely editing config files via SSH.
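
If all you have is a shell and a Python interpreter, the "spot the typo" exercise above can be handed to a small stack-based check. This is only a sketch: `first_bracket_error` is a made-up helper, and it looks at brackets alone, so it would be confused by brackets inside JSON strings.

```python
PAIRS = {")": "(", "]": "[", "}": "{"}
OPENERS = set(PAIRS.values())


def first_bracket_error(text: str):
    """Return (index, message) for the first bracket problem, or None if balanced."""
    stack = []  # entries are (opening bracket, index where it appeared)
    for i, ch in enumerate(text):
        if ch in OPENERS:
            stack.append((ch, i))
        elif ch in PAIRS:
            if not stack:
                return i, f"unexpected {ch!r}"
            opener, _ = stack.pop()
            if opener != PAIRS[ch]:
                return i, f"{ch!r} closes {opener!r}"
    if stack:
        opener, pos = stack[-1]
        return pos, f"{opener!r} is never closed"
    return None


# The string from the comment above; indices are 0-based.
print(first_bracket_error("[{[[{[{[{[{}]}}}]}]]}]"))
```

It reports where the nesting first stops making sense, which is about as much as a human squinting at the line over SSH could hope to find.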

If you work on a team with someone who writes config files like that, I think you've got bigger problems.
vim works via SSH. :)
If you are remotely editing config files, then that is the problem.
In practice, not everybody is managing 10+ identical servers, and manual config + backups works well enough.
I have dozens of small web projects. Part of my project boilerplate is a Fabric file that runs tests and deploys. There's no reason to hand-edit a file on the server unless something is already on fire. And even then . . .
I assume that you already have Apache, your firewall, etc etc set up on the server though? You likely manage OS package updates somehow outside of Fabric?

A fair bit of my stuff co-exists with other things on the same server, so a per-project deployment system couldn't manage everything without occasional conflicts.

Personally I deploy everything in Docker containers now, so I test a static image and push a bit for bit identical image. (EDIT: Even for individual / one-off setups, yes)

But even if you want to do "manual" changes without Docker, or without a configuration management solution like Puppet etc., I'd make them locally in a git repo or similar and then either git pull them or rsync them over. Partly because it gives you flexibility in terms of tooling, but also because it makes it easy to actually test the changes first, or at the very least to syntax check them.

Poorly worded by me - it doesn't really matter how hard it is to write - it does matter how compact and readable it is - how long it takes to get an overview of what's wrong.

It's nice if it's easy to write - but the real test is read, comprehend, modify.

JSON merges the worst and most redundant parts of C and lisp.