by solatic 2353 days ago
> General purpose programming languages are getting more expressive by the day

You know, once upon a time, we understood that declarative approaches to software engineering were superior to imperative ones wherever a declarative approach is feasible. Declarative approaches are much safer and easier to test; the cost is that you can only express what the tool consuming the declaration understands. Imperative approaches are strictly worse for any class of problems that a declarative approach solves within performance requirements. The additional expressiveness of tools like Pulumi is the last thing I want.

YAML is a horrible language for declarative system configuration because a) any sufficiently complex system will require you to generate your declarative codebase in the name of maintainability, b) generating code for any language where whitespace is significant will lead you to an early death, and c) stringly-typed languages are fundamentally unmaintainable at sufficient scale. But this is not an indictment of a declarative approach! It is an indictment of YAML.

> Configuration is code not data.

Data > code. Data does not need to be debugged. The best code you can have is deleted code - deleted code does not need to be maintained, updated, or patched. Code is a necessary evil we write in order to build operable systems, not a virtue in and of itself.

4 comments

I use Lua for configuration files. It's easy to restrict what you can do in Lua (I load configuration data into its own global state with nothing it can reference but itself). Plus, I can define local data to help ease the configuration:

    local webdir = "/www/site/htdocs"

    templates = 
    {
      {
        template = "html/regular",
        output   = webdir .. "/index.html",
        items    = "7d",
        reverse  = true
      },
      
      {
        template = "rss",
        output   = webdir .. "/index.rss",
        items    = 15,
        reverse  = true
      },
      
      {
        template = "atom",
        output   = webdir .. "/index.atom",
        items    = 15,
        reverse  = true
      },
    }

When I reference the configuration state, templates[1].output will be "/www/site/htdocs/index.html". And if the base directory changes, I only have to change it in one location, and not three.

I think "declarative" is a bit of a red herring here. Deterministic/reproducible/pure is a more appropriate distinction: configuration languages like JSON/YAML/XML/s-expressions/etc. are trivially deterministic, but not very expressive, leading to boilerplate, repetition, external pre/post-processing scripts, etc.

Allowing computation can alleviate some of those problems, whether it's done "declaratively" (e.g. Prolog-like, as in CUE) or not (e.g. like Idealised Algol with memory cells).

The main reason to avoid jumping to something like Python isn't that it's "not declarative"; it's that Python is impure, and hence may give different results on each run (depending on external state, random number generators, etc.). Python can also perform arbitrary external effects, like deleting files, which is another manifestation of impurity that we'd generally like to avoid in config.

tl;dr The problem isn't the style of computation, it's the available primitives. Don't add non-deterministic or externally-visible effects to the language, and it wouldn't really matter to me whether it's "declarative" or not.
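As a rough illustration of that point, here is a minimal Python sketch of configuration computed by a pure function. It reuses the template list from the Lua comment above; the names `Template` and `build_templates` are made up for this example. The function reads no files, environment variables, clocks, or random sources, so the same input always gives the same output.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Template:
    template: str
    output: str
    items: Union[int, str]  # an item count, or a duration string such as "7d"
    reverse: bool


def build_templates(webdir: str) -> List[Template]:
    """Pure function: the result depends only on `webdir`."""
    feeds = [
        ("html/regular", "index.html", "7d"),
        ("rss", "index.rss", 15),
        ("atom", "index.atom", 15),
    ]
    # Computation (string concatenation, a loop) is allowed;
    # I/O, randomness, and other external effects are not used.
    return [
        Template(name, f"{webdir}/{filename}", items, True)
        for name, filename, items in feeds
    ]


config = build_templates("/www/site/htdocs")
```

Python itself does not enforce purity here; the sketch only shows the discipline that a restricted config language would guarantee by construction.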

That's a bit of a no-true-Scotsman there. If the problem is just the markup of choice, we should see an alternative markup emerging any time now. If we see imperative-focused tools instead, maybe it's not just about the markup.

We do see alternative "markups", if you want to call them that, emerging that solve the generative issues - the two that come to mind are Dhall and CUE. They emit JSON, which lets them interoperate and stay relevant in a world that predominantly expects JSON/YAML, but they can also be read directly.

There are declarative general purpose programming languages.

That data you are talking about does need to be debugged, like Helm charts and pipeline definitions. Sure, data is better, but config is code, not data.

Generators need to be debugged, not data. It's very easy to test a generator - a few unit tests checking whether, for a given input, the generator produced the expected output, and you're set. Data sometimes needs to be cleaned, but there's no such thing as a bug in data.

Whether the generated declarative output produces the expected behavior on the part of the tool interpreting the declarative output is part of the tool's contract, not the generator or the declarative output. If you need to check the tool's behavior then either a) you wrote the tool or b) you're writing an acceptance test for the tool, which is an entirely different endeavor.
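A minimal sketch of what such a generator test might look like in Python. The generator `generate_service` and its fields are hypothetical, invented for illustration; the point is only that the test compares generated data against expected data and never touches the tool that consumes it.

```python
import json
import unittest


def generate_service(name: str, replicas: int) -> str:
    """Hypothetical generator: typed inputs in, JSON text out."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    doc = {"name": name, "replicas": replicas, "labels": {"app": name}}
    # sort_keys keeps the output stable, so diffs stay readable.
    return json.dumps(doc, indent=2, sort_keys=True)


class GenerateServiceTest(unittest.TestCase):
    def test_expected_output(self):
        doc = json.loads(generate_service("web", 3))
        self.assertEqual(
            doc,
            {"name": "web", "replicas": 3, "labels": {"app": "web"}},
        )

    def test_rejects_zero_replicas(self):
        with self.assertRaises(ValueError):
            generate_service("web", 0)
```

Saved as a module, this can be run with `python -m unittest`.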

Things like pipeline definitions and Helm charts are generators.

No, Helm uses charts (data) to generate object definitions (in YAML). Helm is the generator.

There's nothing that prevents you from writing a unit test that runs `helm template` directly to check whether a given chart with given values will produce a given set of YAML files.

> but config is code, not data.

Config is both. Config variables are data. The code that accesses and uses those variables is... well... code. They should be kept separate, like any other code and data. Config isn't a separate, special entity; it's just another part of the program. The data part should be represented as data and the code part as code. Trying to combine them into a special 'config' language is a mistake.
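A small Python sketch of that separation, with made-up names (`ServerConfig`, `load_config`, `base_url`) and made-up values. The data is plain JSON that could just as well live in a file; the code that validates and uses it is ordinary program code.

```python
import json
from dataclasses import dataclass

# Data: plain values with no behaviour of their own.
CONFIG_TEXT = '{"host": "localhost", "port": 8080, "debug": false}'


@dataclass(frozen=True)
class ServerConfig:
    host: str
    port: int
    debug: bool


def load_config(text: str) -> ServerConfig:
    """Code: parse the data and check it before the program uses it."""
    raw = json.loads(text)
    config = ServerConfig(**raw)  # unknown or missing keys raise TypeError
    if not 0 < config.port < 65536:
        raise ValueError(f"port out of range: {config.port}")
    return config


def base_url(config: ServerConfig) -> str:
    """Code: logic that consumes the config lives with the program."""
    return f"http://{config.host}:{config.port}"


settings = load_config(CONFIG_TEXT)
```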