by digdugdirk 407 days ago
Cool! Out of curiosity, what's the bedrock that's used to determine what the fundamental python AST objects are? I'm wondering what the "single source of truth" is, if you will.

Is this all based off a spec that python provides? If so, what does that look like?

Or do you "recode" the Python language in Rust, then use Rust features to parse the Python files?

Regardless of how it's done - This is a really fascinating project, and I'm really glad you guys are doing it!


There is a formal definition of the AST (the abstract grammar) in the CPython repo, written in a language called ASDL (Abstract Syntax Description Language):

https://github.com/python/cpython/blob/main/Parser/Python.as...
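To get a feel for what the ASDL file pins down, here is a small sketch using CPython's standard-library `ast` module, whose node classes are generated from that definition. It only inspects CPython's own AST; it says nothing about how ruff or ty lay out their nodes internally. The sample source string is just an illustration.

```python
import ast

# Parse a tiny program with CPython's own parser.
tree = ast.parse("x = 1 + 2")

assign = tree.body[0]   # the single statement: an assignment
binop = assign.value    # its right-hand side: a binary operation

# Each node class exposes the field names declared for it in the ASDL file.
print(type(assign).__name__, ast.Assign._fields)
print(type(binop).__name__, ast.BinOp._fields)
print(type(binop.op).__name__)
```

The `_fields` tuples mirror the constructor arguments listed for each node in the ASDL definition, which is why that file works as a "single source of truth" for the node shapes on the CPython side.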

ty uses the same AST and parser as ruff. We don't use the ASDL grammar directly, because we store a few syntax nodes differently internally than how they're represented upstream. Our parser is hand-written in Rust. At first, our AST was also entirely hand-written, though we're moving in the direction of auto-generating more of it from a declarative grammar.

https://github.com/astral-sh/ruff/issues/15655

https://github.com/astral-sh/ruff/tree/main/crates/ruff_pyth...

https://github.com/astral-sh/ruff/blob/main/crates/ruff_pyth...

ditto! but we gave impressively non-overlapping answers
As in, how are we parsing the Python code into an AST?

CPython uses a generated parser. The grammar is defined in https://github.com/python/cpython/blob/main/Grammar/python.g... which is used to generate the specification at https://docs.python.org/3/reference/grammar.html#full-gramma...

We use a hand-written parser, in Rust, based on the specification. We've written about that previously at https://astral.sh/blog/ruff-v0.4.0#a-hand-written-parser
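For readers unfamiliar with the term, here is a rough sketch of what "hand-written parser" means, as opposed to one generated from a grammar file. This is not ruff's parser (which is in Rust and covers the full Python grammar); it is a toy recursive-descent parser in Python for integer arithmetic only. The `Num`, `BinOp`, `tokenize`, and `Parser` names are hypothetical and exist only for this example.

```python
import re
from dataclasses import dataclass


@dataclass
class Num:
    value: int


@dataclass
class BinOp:
    left: object
    op: str
    right: object


# One token is either a run of digits or a single operator/paren.
TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(src):
    tokens, pos = [], 0
    src = src.rstrip()
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class Parser:
    """Each grammar rule is written by hand as one method."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def parse(self):
        node = self.expr()
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return node

    # expr := term (('+' | '-') term)*
    def expr(self):
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = BinOp(node, op, self.term())
        return node

    # term := atom (('*' | '/') atom)*
    def term(self):
        node = self.atom()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = BinOp(node, op, self.atom())
        return node

    # atom := NUMBER | '(' expr ')'
    def atom(self):
        tok = self.advance()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok.isdigit():
            return Num(int(tok))
        if tok == "(":
            node = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {tok!r}")


print(Parser(tokenize("1 + 2 * 3")).parse())
```

The grammar lives in the comments and the control flow rather than in a separate file, so operator precedence falls out of which method calls which. That hand control over each rule is what a generated parser trades away.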