Hacker News new | ask | show | jobs
by chongli 767 days ago
While this is a step up from C, it is still a long way from the full power and generality of algebraic data types. The key word here is algebraic. In a language with ADTs, such as Haskell, you can pattern match on an arbitrarily complex types, not just the outermost tag. A contrived example (from [1]):

    contrived :: ([a], Char, (Int, Float), String, Bool) -> Bool
    contrived    ([],  'b',  (1,   2.0),   "hi",   True) = False
To achieve a result like this using Zig's switch syntax would seem to involve a huge amount of boilerplate code and nested switch statements.

[1] https://www.haskell.org/tutorial/patterns.html

1 comments

This is more of syntax sugar than power and generality, since nested pattern matching can be mechanically translated into "top-level" matching (e.g., see [1] and [2]).

[1] L. Augustsson. Compiling Pattern Matching. In Functional Programming Languages and Computer Architecture, pages 368– 381, 1985.

[2] P. Wadler. Efficient Compilation of Pattern Matching. In S.L. Peyton Jones, editor, The Implementation of Functional Programming Languages, pages 78–103. Prentice Hall, 1987.

This is where I like to cite Dijstra's "Go To Statement Considered Harmful".

What a lot of people miss about that paper is that he wasn't just talking about goto statements. He was also making a more general observation about how more powerful and general programming language features are not necessarily desirable, because they tend to adversely impact developer productivity.

The reason I, as a user, prefer structured control flow statements over goto is not that I believe they are powerful. It's precisely because they are less powerful. The resulting constraints on how the program can be structured make it easier for me to read and reason about existing code. That makes maintaining code easier. It also makes optimization and static analysis easier. And it makes writing tests easier, too.

I have similar feelings about ADTs. The reason I prefer them to other ways of doing composite data types is not that I think they're more powerful. It's that they create constraints that tend to reduce the semantic complexity of the domain models people create in programming languages that use them. And that makes my job easier.

The corollary to that, though, is that I'm not actually all that hype about adding ADTs to existing languages. For reasons that are similar to how the mere availability of structured, reentrant function calls is small consolation in a codebase that's already riddled with goto statements. The real win doesn't come from using ADTs, it comes from not having to worry about all those other confusing overpowered things that aren't ADTs.

That's exactly where I am. Pattern matched sum types "feel great" to code in to an expert because they are a concise and reasonably tight way to express the otherwise boring "enumerate over possibilities" code.

But they're hard to read for anyone who isn't an expert on not just the language but the type in question (c.f. Rust's Option() idioms all looks like line noise to newbies, etc...). And that's a bad trade.

In essence, this stuff is just Perl all over again. It's a language feature that prioritizes concision over comprehension. And I say that as someone who really likes coding in perl. But "people" don't like perl, and the community moved on, and the reasons are... the same reason that uptake in ADTs is lagging where the experts want it to be.

You've got to distinguish syntax from semantics, though. I agree, it's easy to turn a big semantic win into a net readability loss if you choose to represent it with an overly terse syntax that promotes code golf.
Compilers are really smart. It's easy enough in the modern world to demand that a "type" field be checked against all enumerants, preserving the "semantic win". A plain old C switch statement with gcc -Wall will do this today, in fact.
Pattern matching on the "top level branch" of the ADT, whatever you call it, is pretty darned useful.

Pattern matching on the next level down is a power tool to be used with care.

Having used some pattern matching languages for quite some time, I find anything much deeper than that is a code smell at best and pathological at worst. Pattern matching creates coupling proportional to the depth/complexity/precision of the pattern match. The top-level coupling is often unavoidable; if you're pattern matching at all, you certainly care which "branch" you are on and there is likely no refactoring that away. But the danger rises rapidly the deeper in you go. It's just so easy to pattern match on a third-level part of the complex object when you really ought to be wrapping that behind a function somewhere, possibly itself emitting a sum type value.

... but if all you really need is that "top level" match, a lot of pattern matching syntax and features are not really necessary (if not positively dangerous).

> Pattern matching on the "top level branch" of the ADT, whatever you call it, is pretty darned useful. Pattern matching on the next level down is a power tool to be used with care.

Which is exactly how Perl apologia arguments went.

This argument is the most common fallacy I see in programming language discussions. I might as well give it a name right here: "Turing equivalence fallacy" or perhaps "syntax sugar fallacy."

All Turing Complete programming languages are Turing equivalent to one another. Programs written in one language can be mechanically transformed into those written in another. This is irrelevant to the discussion of programming languages. The whole point of creating different programming languages is to explore different ways to express the same program!

In programming language design, we tend to distinguish between global and local analysis. While type checking and elaboration is an example of global analysis, desugaring is inherently local to some piece of code. Therefore, "power" or "expressiveness" usually mean that something cannot be syntactically "expanded"; e.g., while type classes elaborate into explicit dictionaries, they still require information from the type checker, and therefore considered a "real" feature of a programming language. On the other hand, nested pattern matching can be formulated as local syntax transformation, and therefore it doesn't bring anything fundamentally new to the type system or dynamic semantics.

There's also a great talk on the matter [1], if somebody is interested in formalities.

[1] https://www.youtube.com/watch?v=43XaZEn2aLc

You're approaching this from a PL design standpoint where the distinction is important, but from a user perspective it doesn't matter if it's just "syntax sugar" or if it's a super complicated to implement all that matters is whether the feature is available or not.
Typing features affect the way we design APIs. Libraries written in languages with type classes and without them can have completely different designs. If nested pattern matching is not available, this will not affect the APIs, only the function bodies -- because desugaring is local by definition.
That doesn't matter in practice. If two programming languages have the same underlying feature but one has syntactic sugar to make it very easy to use and the other does not (so is quite cumbersome to use) then you'll find that the library ecosystem for the former language will see the feature in widespread use whereas the ecosystem of the latter will tend to shun the feature.

This is one of the social factors of programming language design and it's one of the main reasons successful programming languages work so hard to establish a coherent philosophy and a set of best practices or idioms within the language. For similar reasons, I believe this is why "anything goes" languages such as LISP have struggled to gain widespread adoption: with no philosophy every programmer becomes an island unto themselves.

Abstractly this is true, but software development is a human practice, so it matters not what's technically possible but what people actually do.

That's why the most important difference between C++ and Rust isn't some technicality even though the technical differences are huge, it's cultural. Rust has a Safety Culture and everything else is subservient to that difference.

Sugar matters, Rust's familiar looking loops are just sugar, it only "really" has a single way to do loops, the loop construct, an infinite loop you can break out of. But despite that, people deliberately write the other loops - and the linter strongly recommends that they write them, because the programs aren't just for machines to compile, they're for other humans to read, and a while let loop is an intuitive thing to read for example, so is the traditional for-each style iterator loop.

Of course the syntax sugar is a good thing if it makes it easier to write the code, but if the question is about "expressive power of the type system", it's not really relevant: Zig's type system can properly express a sum type. In addition: pattern matching is orthogonal to ADT, you can have pattern matching in both languages with and without algebraic types. Neither one implies the other.
> Zig's type system can properly express a sum type.

Surely any Turing complete PL can express a sum type? I can't imagine a language that can support products but not sums.

Turing completeness has nothing to do with static type checking. Dynamically typed PLs can't express any type (except Any) yet are still Turing complete.
> Turing equivalence fallacy

Better known as the Turing Tar Pit.

syntax sugar IS power. also if the mechanical transformation is nontrivial, then so is the power of the sugar
I once explored the Epigram dependently typed programming language. It used to preclude many types of free form pattern matches, due to heavy dependence on structured editor (you speciy a type, it generates pattern matching, you cannot change the structure), so it was almost completely unuseable for many, many tasks.

So, while you are formally right, the need of shortcuts in pattern matching is undeniable to me.