by zem 282 days ago
once you get used to it, visitors are a very pleasant way to write ast walking code in python. they essentially generate your case statement for you, so instead of `case ast.Expr: handle_expr(node)` you just write a `self.visit_Expr` method and have the visitor match the node type to the method name and call it.
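
A minimal sketch of that name-based dispatch, using the standard library's `ast.NodeVisitor`. The `NameCollector` class and the sample source string are made up for illustration; only `visit`, `generic_visit` and the `visit_<ClassName>` lookup come from the `ast` module.

```python
import ast


class NameCollector(ast.NodeVisitor):
    """Hypothetical visitor that records every identifier it sees."""

    def __init__(self):
        self.names = []

    # NodeVisitor.visit() looks up "visit_" + type(node).__name__,
    # so this method is called for every ast.Name node.
    def visit_Name(self, node):
        self.names.append(node.id)
        # Keep walking into children (a Name has none of interest,
        # but calling generic_visit is the usual habit).
        self.generic_visit(node)


tree = ast.parse("x = y + f(z)")
collector = NameCollector()
collector.visit(tree)
```

Node types without a matching `visit_...` method fall through to `generic_visit`, which just recurses into the children, so the class only has to mention the node types it cares about.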

No, it's not pleasant at all. It's boilerplate-heavy, non-local and indirect. It's presumably a large part of why pattern matching is arriving in Python.
That's a lot of buzzwords to say that you enjoy shoving everything in one function. :)
In hindsight, I think your description is indeed better!
I guess that's subjective - I'm as big a fan of pattern matching as anyone, but when I was writing a type checker in python we made heavy use of visitors and it made the code pleasant to maintain.
Doing it this way maxes coupling and minimises cohesion.

Your language will have a number of phases/passes to carry out. Let's say LambdaLifting, TypeChecking and Inlining.

All the code for lambda lifting belongs in one module, all the code for type-checking in another module, etc.

If you instead use visitor pattern, you will be looking at all the code related to Variable, Function, Literal in those files respectively.

So when you're working on Function.typecheck(), it will sit in source code just under Function.lambdalift() and just above Function.inline() - things which you don't want to consider together. Meanwhile, you'll need to switch between source files to work on Variable.typecheck() and Literal.typecheck().

> If you instead use visitor pattern, you will be looking at all the code related to Variable, Function, Literal in those files respectively.

I've never organized visitor pattern code that way. Usually it's something like:

  TypeChecker
    visitFunction
    visitVariable
    visitLiteral
  PrettyPrinter
    visitFunction
    visitVariable
    visitLiteral
So related functions (across the types you're visiting) are kept together, you're not revisiting the Function module to add a new visitor there. That would almost defeat the purpose of the pattern.
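
A rough Python sketch of that layout, for concreteness. The node classes, the `accept` methods and the toy typing rule (every parameter is an `int`) are all assumptions made up for this example; only the class and method names mirror the outline above.

```python
from dataclasses import dataclass


# Node classes: each one only knows how to hand itself to a visitor.
@dataclass
class Literal:
    value: int

    def accept(self, visitor):
        return visitor.visitLiteral(self)


@dataclass
class Variable:
    name: str

    def accept(self, visitor):
        return visitor.visitVariable(self)


@dataclass
class Function:
    param: str
    body: object

    def accept(self, visitor):
        return visitor.visitFunction(self)


# One class per pass: all type-checking code lives together here.
class TypeChecker:
    def __init__(self, env):
        self.env = env  # maps variable name -> type name

    def visitLiteral(self, node):
        return "int"

    def visitVariable(self, node):
        return self.env[node.name]

    def visitFunction(self, node):
        # Toy assumption: every parameter has type int.
        inner = TypeChecker({**self.env, node.param: "int"})
        return f"int -> {node.body.accept(inner)}"


# A second pass is a second class; the node classes are not touched.
class PrettyPrinter:
    def visitLiteral(self, node):
        return str(node.value)

    def visitVariable(self, node):
        return node.name

    def visitFunction(self, node):
        return f"fun {node.param} => {node.body.accept(self)}"


identity = Function("x", Variable("x"))
checked = identity.accept(TypeChecker({}))
printed = identity.accept(PrettyPrinter())
```

Adding a third pass would mean adding a third visitor class, with no edits to `Function`, `Variable` or `Literal`.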

https://en.wikipedia.org/wiki/Visitor_pattern - See the UML diagram here.

That's not the only difference - the other issue is that you lose the stack, and must rely on member variables instead.

If you search for pop(), you can see that

    self.expected_ret.append(ret)
    self.generic_visit(n)
    self.expected_ret.pop()
and

    self.push(narrows_true);  [self.visit(s) for s in n.body];  self.pop()
    self.push(narrows_false); [self.visit(s) for s in n.orelse]; self.pop()
In the functional style, you just pass a param using the stack, rather than using an explicit stack.

It's not so bad here, but with a big enough language, and more complicated algorithms, the mutable member variables basically become "mutable globals".

And if you re-call visit() at arbitrary depths, IMO the algorithm gets obscured.
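
To make the contrast concrete, here is a small sketch of the same traversal written both ways. It is not taken from the code quoted above; the task (pairing each `return` with the name of its enclosing function) and all names are invented for illustration.

```python
import ast

SOURCE = """
def outer():
    def inner():
        return 1
    return inner
"""


class ReturnOwnerVisitor(ast.NodeVisitor):
    """Visitor style: context lives in a member variable used as a stack."""

    def __init__(self):
        self.func_stack = []  # explicit stack of enclosing function names
        self.found = []

    def visit_FunctionDef(self, node):
        self.func_stack.append(node.name)
        self.generic_visit(node)
        self.func_stack.pop()

    def visit_Return(self, node):
        self.found.append(self.func_stack[-1])


def return_owners(node, enclosing=None):
    """Functional style: context is a parameter, so the call stack holds it."""
    if isinstance(node, ast.FunctionDef):
        enclosing = node.name
    elif isinstance(node, ast.Return):
        return [enclosing]
    found = []
    for child in ast.iter_child_nodes(node):
        found.extend(return_owners(child, enclosing))
    return found


tree = ast.parse(SOURCE)
visitor = ReturnOwnerVisitor()
visitor.visit(tree)
from_visitor = visitor.found
from_function = return_owners(tree)
```

Both versions produce the same list. The difference is that `enclosing` is restored automatically when the recursive call returns, while `func_stack` has to be pushed and popped by hand.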

---

That said, I agreed here that visitors are useful when you need to, say, traverse all string literals in an AST at arbitrary depths: https://lobste.rs/s/jdgjjt/visitor_pattern_considered_pointl...

---

A sign that this issue isn't settled is that two of the more complex type checkers make opposite decisions:

- mypy uses visitors extensively - https://github.com/python/mypy/tree/master/mypy

- TypeScript mostly uses switch/case functions - https://github.com/microsoft/TypeScript/blob/main/src/compil...

I'd be interested in analysis of why that is, but I suspect it's mainly style

yeah, pytype used a mix of visitors and if statements (we were trying to retain 3.8 compatibility for a while so we didn't switch to `match`), depending on what fit various parts of the code best. it wasn't a particularly dogmatic "we will use visitors because that's the one true design pattern" thing, just that some problems fit the pattern neatly.