by chubot 3407 days ago
I'm sure it's as old as dirt, but I don't think there is a good name for it. "Concrete Syntax Tree" is not a good name for the reasons pointed out in the article.

Do you have a reference for this? I saw this problem mentioned in the write-up on the ZINC Abstract Machine by Xavier Leroy (author of OCaml). But I don't know of other papers that talk about this.

Also, I believe that most open source tools do NOT have this functionality. Look at lib2to3. It's bolted on -- not exactly a clean design. Most open source front ends are not designed for tooling the way Clang is.

2 comments

The ANTLR folks call it a "parse tree".
Please read the article. I specifically mention ANTLR, parse trees, and why the lossless syntax is not a parse tree / concrete syntax tree.
My source is in-person conversations with a real-life human being, and working on one of his codebases that employed the technique. If you want to look up his work, his name is Bill McKeeman. I personally have never felt compelled to find secondary sources when I had primary sources.

Edit: I'm personally not surprised that open-source codebases don't employ this technique. Lots of great PL work was done for private companies until the 90s, and while lots of work was published in papers and books, precious few open-source PL communities historically drew from academia. I'm sure you know the counter-examples.