Hacker News
by justusw 2187 days ago
Very interesting. This PEP is still in draft state, but I am interested to see how the community will react. For me, I have a few thoughts:

1) This is really close to Erlang/Elixir pattern matching and will make fail-early code much easier to write and easier to reason about.

2) match/case means double indentation, which I see they reasoned about later in the "Rejected ideas". Might have a negative impact on readability.

3) Match is an already used word (as acknowledged by the authors), but I think this could have been a good case for actually using hard syntax. For me, perhaps because I'm used to it, Elixir's "{a, b, c} = {:hello, "world", 42}" just makes sense.

4) I hope there won't be a big flame-war debacle like with :=

5) And then finally there is the question of: "It's cool, but do we really need it? And will it increase the surprise factor?" And here I'm not sure. And again, this was the concern with the new assignment expression. The assignment expression is legitimately useful in some use cases (no more silly while True), but it might reduce the learnability of Python. Python is often used as an introductory programming language, so the impact would be that curricula need to be adjusted or beginner programmers will encounter some surprising code along the road.
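For readers who have not seen the "silly while True" idiom mentioned in point 5, here is a minimal sketch of the two styles side by side. The stream contents and chunk size are made up for illustration; the second loop needs Python 3.8 or later.

```python
import io

# Hypothetical input: an in-memory stream standing in for a file or socket.
stream = io.BytesIO(b"abcdefghij")

# Older idiom: loop forever and break once read() returns an empty value.
chunks_old = []
while True:
    chunk = stream.read(4)
    if not chunk:
        break
    chunks_old.append(chunk)

stream.seek(0)  # rewind so the second loop sees the same data

# Assignment expression: bind and test in the loop condition itself.
chunks_new = []
while chunk := stream.read(4):
    chunks_new.append(chunk)
```

Both loops collect the same chunks; the second one is shorter, but it is also the kind of line a beginner may need explained.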

I can't say this is a good or bad proposal, I want to see what other opinions are out there, and what kind of projects out there in the world would really benefit from syntax like this.

3 comments

One difference I noticed from Elixir was this:

> While matching against each case clause, a name may be bound at most once, having two name patterns with coinciding names is an error.

  match data:
    case [x, x]:  # Error!
      ...
Which is a bit of a shame. This comes in handy in Elixir to say "the same value must appear at these places in the collection". I.e. for a Python tuple pattern `(x, y, x)`, `(3, 4, 5)` would not match but `(3, 4, 3)` would.

Overall, though, I think this will be a great addition to Python. Pattern matching is generally a huge boost for expressiveness and readability, in my opinion.

Although it won't be as concise, you can use guards to emulate this feature with

    match data:
        case [x1, x2] if x1 == x2:
            ...
I agree on this being a fantastic addition to the language. I've sorely missed not having pattern matching in Python after using Rust.
I think this is a better version. It is less ambiguous and more general. With the [x, x] version it isn't clear whether it is using `__eq__` or `is`.
Guards feel superfluous if the x, x pattern can be trivially lowered into a guard.
More vice versa. Guards look more general (being able to express not-equal, and less-than etc), so special semantics for `x,x` only seem warranted if it's very common.
Agreed they're more general. If I have a working guard implementation I can build the x, x case atop it. Meaning, the x, x case can be sugar.
Which equality will you use if it's not numbers, but something more complicated?

Does `(x, y, x)` match some `(a, b, c)` where `a == c`, but not `a is c`? If yes, and you mutate `x` in the body, does it mutate `a` or does it mutate `c`?

I think at least making it an error now can leave it open to defining it later.

There's certainly no reason to use reference equality. In fact, I think equality is the wrong question: it should duplicate the match. `case (x, y, x)` should `__match__((a, b, c))` IFF `a.__match__(c)` (Not sure if that's exactly the right syntax for the Python calls, but hopefully the idea is clear.)

(Elixir has a similar operation to "pin" in a match: if you use an existing variable as the pattern, but pin it, the value must match whatever the variable is already bound to.)

> I think at least making it an error now can leave it open to defining it later.

Fair point.

What's interesting here is this is gated on the new parser from PEP-617 [1]. Based on mailing list discussions [2], and the PEP, the use of `match` will be context-sensitive, so it shouldn't be as disruptive of an introduction as `async`.

1: https://www.python.org/dev/peps/pep-617/

2: https://lwn.net/Articles/816922/

I have a visceral dislike of pattern matching. Lisp shows just how much people will abuse it in real-world production codebases. It becomes impossible to understand even simple logic without comments. I’d link to some examples, but I’m on mobile; suffice to say, pull up the emacs codebase and read through some of the more advanced modules like edebug.el. I’m not certain that one uses pattern matching, but it’s a perfect example of “this codebase cannot be understood without extensive study of language features.”

You may argue that I am simply not versed enough in pattern matching. “You should study harder.” I would argue that simplicity is worth striving for.

I hope this PEP never moves beyond draft.

It’s also shocking that most people here seem to be tacitly supporting this, or happy about it. Yes, it’s cool. Yes, it might simplify a few cases. But it will also give birth to codebases that you can’t read in about, say, 5 years. And then you’ll have a bright line between people in the camp of “This is perfectly readable; it does so and so” and the rest of us regular humans that just want to build reliable systems.

And oh yes, it becomes impossible to backport to older python versions. Lovely.

A good pattern matching library in Lisp makes code a heck of a lot more readable.

Firstly, even basic list destructuring with destructuring-bind is an improvement over a soup of car/cadar/caddr/.

Suppose we are in a compiler and would like to look for expressions of the pattern:

   (not (and (not e0) (not e1) (not e2) ...))
in order to apply De Morgan's law and rewrite them to

   (or e0 e1 ...)
I would rather have a nice pattern matching case like this:

   (match-case expr
      ...
      ((not (and @(zeromore not @term)))
       `(or ,term))  ;; rewrite done!
      ...)
than:

   (if (and (consp expr)
            (eq (car expr) 'not)
            (consp (cdr expr))
            (consp (cadr expr))
            (null (cddr expr))
            (eq (caadr expr) 'and)
            (eq (cadr expr) 
      ... ad nauseam)
Even if a fail-safe version of destructuring-bind is used to validate and get the basic shape, it's still tedious:

    (destructuring-case expr
         ...
       ((a (b . c))
        (if (and (eq a 'not)
                 (eq b 'and))
           ... now check that c is a list of nothing but (not x) forms
           )))

I don't have a pattern matcher in TXR Lisp. That is such a problem that it's holding up compiler work! Because having to write grotty code just to recognize patterns and pull out pieces is demotivating. It's not just demotivating as in "I don't feel like doing the gruntwork", but demotivating as in, "I don't want to saddle my project with the technical debt caused by cranking out that kind of code", which will have to be rewritten into pattern matching later.
People use pattern matching all the time in ML or Rust or Haskell or Scala or Elm. It's totally uncontroversial there, and it's helpful to readability, not harmful. Erlang and Elixir also show it works in untyped languages pretty well.
Python is famous for its readability. ML, Rust, and Haskell are famous for their unreadability.
Apart from Clojure, lisps generally do not support destructuring pattern matching on an object/dict.
The defining feature of lisp is that it can support whatever you want, because the AST is available at compile time and is completely regular. If your point is about dicts specifically, then you might be technically correct, but I assure you that the majority of lisp codebases do support exactly the sort of pattern matching in this PEP. And the abuses are frankly egregious. Racket is the worst offender of them all, with syntax matching.
Trivia or Optima in Common Lisp support that.
I’ve never had this issue with lisp code bases, and I’ve read through quite a bit of lisp by now. For me, lisps—whether CL, Clojure or Emacs Lisp—are some of the easiest languages for me to identify and correct the source of a bug in.
Can you provide any examples?