Hacker News new | ask | show | jobs
by emn13 2019 days ago
Yeah, there's something to be said to a format that makes it hard to shoot yourself in the foot; essentially. That point is somewhat orthogonal to the issue of how easy it is to author a config value, however.

By the way, you conflate purity with turing completeness; but the two are not really all that strongly related. It's possible to have a turing incomplete language that is nevertheless impure (public I/O without unconstrained repetition), and conversely a turing complete language that is pure (i.e. keep your tape private).

I'd argue that turing completeness isn't as relevant as people make it out to be here. It's not a good thing, mind you, but it's just not that problematic either; externally imposed termination and storage limitation can render any turing complete system into a turing incomplete system - that's easy - but a system with uncontrolled sideeffects is almost intrinsically hard to manage. In fact, even technically turing-incomplete systems may well need to impose similar limitations anyhow, because a technically turing incomplete language that allows (say) nested loops or iteration - albeit bounded - may well not practically terminate, or nevertheless cause too much I/O. Some languages are really limited, and perhaps then you can get away without externally imposed resource constraints, but it's not clear to me how realistic that scenario is.

The real problem (to my mind) in general-purpose languages when it comes to using them for config-specification is not turing completeness, it's purity (i.e. reproducibility). And that's not even really a language issue alone, it's because those languages tend to come with large, pervasively used libraries, to the point that it's not trivial to just take some code off stackoverflow (say) and reliably tell whether it's pure or not - because that depends on the internals of all of those library methods too.

1 comments

> and conversely a turing complete language that is pure (i.e. keep your tape private).

A Turing complete language can't be pure because there is no value that is equivalent to a nonterminating computation.

That's irrelevant right? The point is that it's reproducible. Whether you define purity as to include non-termination or not is besides the point; the point is to avoid side-effects. Lack of side effects matters in the context of configuration, non-termination does not (and see the thread you're replying to for an argument as to why that is). That's kind of the whole point of the argument.
> That's irrelevant right? The point is that it's reproducible.

It's not reproducible if it's not a value. The point of a pure function is that you can replace it with the value that it evaluates to.

If you include nontermination as a value in your language then your language becomes almost impossible to reason about as you break almost every equivalence property you could think of. E.g. you can no longer say x * 0 = 0.

> Lack of side effects matters in the context of configuration, non-termination does not (and see the thread you're replying to for an argument as to why that is).

I don't find "but terminating code may still take a long time" to be a convincing argument that nontermination isn't important; rather it's an argument that code taking a long time might also be important (at least to the extent that it actually comes up in practice, which I'm not convinced of).

I think it's pretty reasonable to say that technically you might not be able to equivalently replace x * 0 by 0. Note that if it's replacable, it's reliably replaceable. This is essentially how pretty much all functional languages work incidentally - functions may not terminate, and in theory that can cause issues, but in practice not so much. Part of the saving grace here is that:

(A) - you're exceedingly unlikely to run into this issue in the first place. Nontermination limmits aren't set to things you're likely to hit without a runaway loop or recursion.

(B) - when you do hit the a forced termination issue, such replacements are usually irrelevant, i.e if your algebraic rewrite doesn't affect the recursion or loop it won't affect bail out either. Depending on how you implement forced termination, you can likely guarantee this, but it's not very valuable to.

(C) - The alternative isn't real if you allow theoretically bounded loops but with high limits. You can decide not to specify forced termination, but that doesn't mean you don't have it; it simply means the OS or user will terminate the process instead, and reasoning about that is much, much harder. A system with small limits is possible, but those are much less practical to start with. And there have been quite a few systems over the years that tried to impose such limits by design and then it turned out that there were escape-hatched that could be abused to nevertheless impose huge load (if you can do any kind of doubling in an iteration, you don't need many to cause denial-of-service).

(D) - Although it's possible an algebraic rewrite could affect termination, the scope for this is pretty constrained. Either it works, or the whole system fails to terminate; there's no middle ground. That means if you simply assume termination will occur and deal with the code as if it were pure, you'll either end up with a functioning system, or a clearly non-functioning system, but without corruption or any unacceptable uncertainty. (It's possible to shoot yourself in the foot here, but I don't it's possible to do so accidentally).

I mean, if you want to make the argument that all of this is tricky - yes, it sure is; and there are a few risks and some complexity to all this! But simultaneously, I don't think you're going to do a lot better if you want the kind of flexibility that recursion and looping allow. These complexities are pretty manageable in practice; the risks limited. And if you need even tighter guarantees you're going to need to lose most recursion and loops, likely even bounded loops. I've never used it, but earlier in this thread somebody mentioned starlark - and while they clearly tried to avoid turing completeness (loops are bounded, and no recursion by the looks of it), they do allow nested loops with large bounds; i.e., given whatever time you think your OS or user will be willing to wait before pulling the plug you wont hit those bounds: in terms of reasoning, you cannot rely on termination.

But I think that's a decent trade-off. The restrictions needed to reliably terminate in some bounded amount of resources are just too onerous to leave room for a language that can come close to one that does not have those restrictions. As such, it's fine to have either a deterministic, side-effect free language with the risk of (practical) non-termination, or a language with very, very limited looping (e.g. no nesting, and perhaps only constructs like map as opposed to iterating over ranges) - but not much room for anything in between.

Again, the context is the kind of languages you might consider for configuration. And in that context I don't think that turing completeness is all that relevant, compared to determinism and no side-effects, assuming termination. It's those latter aspects that really have a huge impact, and termination mostly in theory, not in practice.