by skrebbel 3215 days ago
There are editors that do this. Notably, JetBrains MPS comes to mind. It looks like a text editor, but once you use it you quickly notice that you're actually editing the abstract syntax tree directly.

It's cool, but it has some major downsides too. For example, MPS stores the source as XML, not text (since it isn't text, it's a tree). This makes lots of basic tools we've taken for granted a lot harder, such as git merging etc. They've had to make a custom mergetool just to make basic collaborative coding feasible.

I bet there are other ways around that; all I'm saying is that text has major, major upsides because of the enormous ecosystem support.


> For example, MPS stores the source as XML, not text

If only there were a way to write your code using a uniform tree syntax in the first place...

> It's cool, but it has some major downsides too. For example, MPS stores the source as XML, not text (since it isn't text, it's a tree). This makes lots of basic tools we've taken for granted a lot harder, such as git merging etc. They've had to make a custom mergetool just to make basic collaborative coding feasible.

Doesn't solving this just require a text-to-AST, AST-to-text input and output step?

Anyplace outside the editor the programmer just sees regular text.
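
To make the text-to-AST and AST-to-text step concrete, here is a minimal sketch using Python's standard `ast` module (Python 3.9+ for `ast.unparse`). It is only an illustration of the round trip, not how MPS works; the `round_trip` helper and the sample snippet are made up for this example.

```python
import ast


def round_trip(source: str) -> str:
    """Parse text into an AST, then render that AST back to text."""
    tree = ast.parse(source)   # text -> AST
    return ast.unparse(tree)   # AST -> text, in one canonical layout


# Hypothetical, oddly formatted input
messy = "x   =  ( 1+2 )\nprint( x )"
clean = round_trip(messy)
print(clean)
```

One caveat with this particular module: `ast` discards comments and the original layout, so the round trip preserves structure but is not lossless. A real tool would need a tree that also carries that information.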

The issue with this is that operations on text don't necessarily preserve a valid AST. Doing `git merge` on the plain text of a source file may result in invalid code, at which point you have other annoying questions to answer about how to handle text that doesn't parse into a valid AST.
It would be nice to have a VCS that could work on the native MPS data structures.
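
As a rough illustration of that failure mode, a tool could check whether the merged text still parses before trying to load it as a tree. This sketch uses Python's `ast` module as a stand-in for the real parser; the `parses` helper and both snippets are hypothetical.

```python
import ast


def parses(source: str) -> bool:
    """Return True if the text is syntactically valid Python."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


before_merge = "values = [\n    1,\n    2,\n]\n"
# Hypothetical output of a line-based merge that dropped the closing bracket
after_merge = "values = [\n    1,\n    2,\n"

print(parses(before_merge), parses(after_merge))
```

The check only detects the problem. What to do with text that fails it is exactly the annoying question raised above.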

Text is probably the next biggest mistake in programmer productivity after null.

But why? Tools can parse text just fine, and create their MPS structures in the background to do whatever they need. Why expose this to the programmer?
Rich structured editors free one from text and are able to encode other information that is not currently recorded in text formats. Directly operating on structures would free languages from parsing, correctness checking and compiling could occur at every semantically correct operation.

With a rich structure editor that can do merges, the undo history of edit and refactor operations could be persisted and merged into the VCS. Currently this isn't possible. Text is a projection for the page and a lowest common format.

> Directly operating on structures would free languages from parsing, correctness checking and compiling could occur at every semantically correct operation.

Directly operating on structures would mean that you'd have to write an editor, which had to enforce correctness as well. And then you'd have to write a generator to save those structures in some kind of format that could be written to a file and passed around, and a parser to read such format. And check for correctness again, since who knows what generated that file.

As for constant compilation, that already exists, many IDEs have it. That's because parsing text is not actually hard, the other stages are.

> With a rich structure editor that can do merges, the undo history of edit and refactor operations could be persisted and merged into the VCS. Currently this isn't possible.

Of course it is: you could write a plugin for any IDE that would record edits and refactor operations and save those in or alongside the text (much like they'd have to be saved alongside the AST). Of course, that doesn't help if the user does a manual refactor, but that's no different from them choosing a node in the rich editor, deleting it, then manually recreating it in its refactored form.

Instead of retrofitting a structured format on top of the current text-centric world, can we imagine whether a structure-centric world would be better? A large number of tools would exist to operate semantically on the same structured format, including editors, versioning systems, grep, etc. Diffs and merges would work better. Languages would define the syntax in terms of a tree input instead of text input, and so on.
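
For a sense of what a semantic grep might look like, here is a small sketch that searches a tree instead of text. It uses Python's `ast` module purely as an example of a structured format; `find_calls` and the sample source are invented for illustration.

```python
import ast


def find_calls(source: str, name: str) -> list[int]:
    """Return the line numbers of calls to `name`, found by walking the AST."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == name
    )


# Hypothetical source: a plain-text search for "log(" would match all four lines
source = '''\
log("start")
message = "remember to call log() later"
# log("commented out")
result = compute(log("nested"))
'''
hits = find_calls(source, "log")
print(hits)
```

The string literal and the comment are skipped because they are not call nodes in the tree, which is the kind of precision a structure-aware tool gets for free.
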
We could, I'm just not convinced it would actually be better than text. Parsing is not a difficult problem.
I thought about this a while ago. It would be nice if you could have all code stored as the AST and the formatting handled on ingestion/export.

It would finally settle the formatting arguments, and all the tooling would be able to leverage each other's projects.
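
A quick sketch of why formatting stops mattering once the tree is the stored form: two differently formatted sources produce the same AST. This again leans on Python's `ast` module as a stand-in, and `same_structure` plus the snippets are hypothetical names for the example.

```python
import ast


def same_structure(a: str, b: str) -> bool:
    """Compare two sources by their ASTs, ignoring layout and positions."""
    # ast.dump omits line/column attributes by default, so only structure is compared
    return ast.dump(ast.parse(a)) == ast.dump(ast.parse(b))


tabs_style = "def f(a,b):\n\treturn a+b\n"
spaces_style = "def f(a, b):\n    return a + b\n"
different = "def f(a, b):\n    return a - b\n"

print(same_structure(tabs_style, spaces_style))
print(same_structure(tabs_style, different))
```

Each developer could then export the stored tree in whatever layout they prefer, assuming the exporter can regenerate text from the tree.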

Even PHP has an internal AST representation these days.

Take a look at Smalltalk, where you edit the live in-memory objects directly.