by henrikschroder 3849 days ago
> I think AST and semantic-analyzer are going to play an increasing role in a variety of software development activities

It would be fantastic if source control systems worked on the AST instead of on plain text files; so many annoying problems could be solved that way.


There are simpler and less intrusive ways to solve outside-AST issues (like an automatic reformatting pass before each build or each diff).
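
A rough sketch of that reformatting idea, assuming Python 3.9+ (for `ast.unparse`) and using the standard library as a stand-in for a real formatter. `canonical` and `normalized_diff` are made-up names, and note that this particular round trip drops comments:

```python
import ast
import difflib


def canonical(source: str) -> str:
    """Reproject source through the parser to erase formatting differences."""
    return ast.unparse(ast.parse(source)) + "\n"


def normalized_diff(old: str, new: str) -> list[str]:
    """Diff two sources only after both have been canonically reformatted."""
    return list(difflib.unified_diff(
        canonical(old).splitlines(),
        canonical(new).splitlines(),
        "old", "new", lineterm="",
    ))


old = "def f( a,b ):\n    return a+b\n"
reformatted = "def f(a, b):\n    return a + b\n"
changed = "def f(a, b):\n    return a - b\n"

# Whitespace-only edits vanish; a real change still shows up.
whitespace_only = normalized_diff(old, reformatted)
real_change = normalized_diff(old, changed)
print(whitespace_only)
print("\n".join(real_change))
```

The line-based VCS stays untouched here; only the text fed to the diff is normalized.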

What you're suggesting raises a bunch of new (non-trivial) issues:

- What would you do with code comments? Things like "f(/+old_value+/new_value)".

- How to store code before preprocessing (C and C++)?

- How to store files mixing several languages (PHP, HTML, JavaScript)?

- How do you store code for a DSL?

> - What would you do with code comments? Things like "f(/+old_value+/new_value)".

Comments are included in the AST; the AST should be reprojectable into canonical plain text.
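
For what it's worth, not every existing parser keeps comments. A minimal sketch, assuming Python 3.9+ and only the standard library: the stdlib `ast` module loses the comment on a round trip, while `tokenize` still sees it, so a tree meant for version control would have to carry that extra information.

```python
import ast
import io
import tokenize

source = "x = 1  # the answer\n"

# Round trip through the stdlib AST: the comment is not part of the tree.
roundtrip = ast.unparse(ast.parse(source))

# The token stream, by contrast, still contains the comment.
comments = [
    tok.string
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type == tokenize.COMMENT
]

print(repr(roundtrip))  # 'x = 1'
print(comments)         # ['# the answer']
```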

> - How to store code before preprocessing (C and C++)?

This could get tricky. Punt: store it as opaque text (CDATA).

> - How to store files mixing several languages (PHP, HTML, JavaScript)?

Same file format, different semantics. PHP is a DSL.

My new language manifesto includes having a mandatory publicly defined AST.

If it is reprojectable into canonical plain text, it's not really an AST - just an ST.

If version control stored refactorings (tree operations) against the AST it would open up a whole new world of possibilities.
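
To make "stored refactorings" a bit more concrete, here is a naive sketch, assuming Python 3.9+. `RenameFunction` is a hypothetical record type, not something any existing VCS provides, and it ignores scoping entirely (it renames every matching name):

```python
import ast
from dataclasses import dataclass


@dataclass(frozen=True)
class RenameFunction:
    """Hypothetical stored refactoring: rename a function and its uses."""
    old: str
    new: str

    def apply(self, source: str) -> str:
        tree = ast.parse(source)
        for node in ast.walk(tree):
            # Rename the definition itself.
            if isinstance(node, ast.FunctionDef) and node.name == self.old:
                node.name = self.new
            # Rename references such as call sites (no scope analysis).
            elif isinstance(node, ast.Name) and node.id == self.old:
                node.id = self.new
        return ast.unparse(tree)


op = RenameFunction(old="f", new="g")
renamed = op.apply("def f():\n    return 1\nprint(f())\n")
print(renamed)
```

The history would then hold `op` itself rather than the changed lines, so the same operation could be replayed on a branch that has since added more calls to `f`.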

What kinds of problems are you talking about?

One really obvious one is merge conflicts in the following pattern:

    #old_file.py
    ...
    def func_before(*params):
        do_things()


    def func_after(*params):
        do_other_things()

    ...

If someone adds a function between func_before and func_after, and their coworker adds a different function also between func_before and func_after, you get a merge conflict because the line-based VCS doesn't know how the new functions should be ordered (or, potentially, interleaved ;) ). An AST-aware version control system could[1] realize that the order of function definitions is meaningless, and then wouldn't need to ask for help resolving the conflict.

[1] This isn't always true, so this might still involve some degree of being specialized-to-the-language. Or it might just mean that the AST the VCS worked with allows for fairly complicated types, and can distinguish between order-sensitive things and order-insensitive things.
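
A minimal sketch of such a merge, assuming Python 3.9+ and assuming that top-level definition order really is meaningless. `top_level_defs` and `merge_added_defs` are made-up names; only added functions are handled, deletions and non-function statements are ignored, and any edit to an existing function is reported as a conflict to keep this short:

```python
import ast


def top_level_defs(source: str) -> dict[str, str]:
    """Map each top-level function name to its canonical source text."""
    return {
        node.name: ast.unparse(node)
        for node in ast.parse(source).body
        if isinstance(node, ast.FunctionDef)
    }


def merge_added_defs(base: str, ours: str, theirs: str) -> str:
    """Three-way merge that treats top-level functions as an unordered set."""
    base_defs = top_level_defs(base)
    merged = dict(base_defs)
    for side in (top_level_defs(ours), top_level_defs(theirs)):
        for name, text in side.items():
            if name in merged and merged[name] != text:
                raise ValueError(f"real conflict in {name!r}")
            merged[name] = text
    # Base functions keep their order; additions are appended sorted by
    # name so the result does not depend on which side is "ours".
    added = sorted(name for name in merged if name not in base_defs)
    order = list(base_defs) + added
    return "\n\n\n".join(merged[name] for name in order) + "\n"


BASE = (
    "def func_before(*params):\n    do_things()\n\n\n"
    "def func_after(*params):\n    do_other_things()\n"
)
split = BASE.index("def func_after")
ours = BASE[:split] + "def helper_a():\n    pass\n\n\n" + BASE[split:]
theirs = BASE[:split] + "def helper_b():\n    pass\n\n\n" + BASE[split:]

merged = merge_added_defs(BASE, ours, theirs)
print(list(top_level_defs(merged)))
# ['func_before', 'func_after', 'helper_a', 'helper_b']
```

Note that both helpers end up after func_after rather than between the two original functions, which is the price of throwing the ordering away.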

Function definition order is certainly meaningful, at least to a human reader. Suppose co-worker 1's function is related to func_before, and co-worker 2's function is related to func_after, and the VCS flipped them around.

Or alternatively, I go in by myself and re-arrange the order of functions in a file to improve the clustering. If the VCS thinks the order of functions is meaningless, it won't recognize the change.
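
The reordering case is easy to demonstrate. A small sketch, assuming Python and the standard `ast` module (`def_names` is a made-up helper): an order-insensitive comparison reports no change after a pure re-arrangement, while an order-sensitive one does.

```python
import ast


def def_names(source: str) -> list[str]:
    """Names of top-level functions, in file order."""
    return [
        node.name
        for node in ast.parse(source).body
        if isinstance(node, ast.FunctionDef)
    ]


before = "def a(): pass\ndef b(): pass\ndef c(): pass\n"
after = "def a(): pass\ndef c(): pass\ndef b(): pass\n"  # b and c swapped

# A VCS that treats definitions as a set sees nothing to commit.
unordered_same = set(def_names(before)) == set(def_names(after))
# A VCS that treats them as a sequence sees the re-arrangement.
ordered_same = def_names(before) == def_names(after)

print(unordered_same, ordered_same)  # True False
```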