by vidarh 712 days ago
I spent a lot of time on this too, many years ago, including working on a language designed specifically for it. My conclusion was that I couldn't find a way to make it work without making the visual representation "just a tool" for working with a textual representation. We still need to be able to communicate about code in all kinds of contexts where a largely visual system would force us to share screenshots, and we still need to handle e.g. diffs and source control. And once you still need to know, understand, and work with a textual representation, it feels like a lot of the potential evaporates, and that made me gradually lose interest.

I'd love it if someone found a solution to that, but it feels like an intractable problem beyond some specific use cases as you say.

I really hope I'm wrong.


There needs to be some way of sharing diagrams, but hopefully not as screenshots? Maybe more like a higher-level SVG, where there is text and you can look at it, but it’s not the preferred way of viewing it. It would need to be easy to embed these diagrams in other documents, which suggests a standard format and viewers readily available, for browsers and editors at least.

Support for embedding such things in another language (like we do with SQL and regular expressions) would be important, too.

That might be a place to start. Can we replace regular expressions with a visual language? What would it take to have “railroad diagrams” widely available in at least one programming language and many editors?

The problem is now you're reinventing the entire software stack of everyone you ever need to communicate with about it.

Consider how hard it is to even get people to use additional symbols, like the APLs do...

Yes, good example. On the other hand, emojis seem pretty popular, and many programming languages do support non-ASCII symbols in identifiers. We just don’t use them much for actual code.

Along with markdown, there’s also increasing support for math equations in forums and blogging software.

Emojis have universal appeal. Coding symbols don't.

Math symbols are a better example, but also extremely well established, and it's still taken a very long time to get widespread support, and it's still far from universal.

But note that non-ASCII symbols etc. were an example of the difficulty of even getting support for something "that simple". Now try to take the step up to e.g. a node-based editor.

That’s fine though? Ultimately everything you do on the computer is just a representation of bits being shuffled around. That doesn’t diminish its meaning or the effectiveness of a better user interface.
The problem becomes if the form you're used to working with the code in is very different from the form you need to communicate about it in.

In practice every attempt I've looked at either becomes hard to communicate about code in, or the visual aspect ends up as just a secondary visualisation of code that you still treat as textual first.

In the latter case, turning it into a better UI is an unsolved problem, because round tripping reliably between something readable as text and a meaningful visual representation is really hard.

As I said, I hope someone solves this, but most attempts aren't even pushing the boundaries into uncharted territory - a lot of attempts have been made over the years.

The trick is that you don't round trip. You choose one immutable data structure that captures both the textual source of the program and the semantic information captured by the parser at the same time.
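
To make the "one immutable data structure" idea concrete, here is a minimal sketch of what it could look like. Everything in it is an assumption for illustration: the `Token`, `Node`, `tok` and `node` names are hypothetical, and the tree is built by hand, not produced by any real parser.

```ruby
# Hypothetical sketch: a frozen syntax tree whose leaves keep the exact
# source text (whitespace included), so the text is a projection of the tree.
Token = Struct.new(:kind, :text) do
  def source
    text
  end
end

Node = Struct.new(:type, :children) do
  # Concatenating the leaves reproduces the original text exactly.
  def source
    children.map(&:source).join
  end
end

def tok(kind, text)
  Token.new(kind, text).freeze
end

def node(type, *children)
  Node.new(type, children.freeze).freeze
end

# Hand-built tree for the text: link = find(links)
TREE = node(:assign,
  tok(:ident, "link"), tok(:space, " "), tok(:op, "="), tok(:space, " "),
  node(:call,
    tok(:ident, "find"), tok(:punct, "("), tok(:ident, "links"), tok(:punct, ")")))

puts TREE.source              # textual view: link = find(links)
puts TREE.children.last.type  # semantic view: call
```

Note that this only shows the storage side. It says nothing about how a visual editor would present or edit the same tree.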
That doesn't solve anything. The problem isn't how to represent the AST, but defining both a visual and a textual version that can unambiguously represent the same thing without either or both representations becoming unusable.
Right. I think it's already a solved problem though. HTML and the HTML DOM are a living solution to that exact problem. All that we have to do is take the patterns used to power general UI and develop a DOM for code.
The internal representation isn't the problem, or even a problem. It's not even beginning to address what I described.

E.g. this is a real line of code:

    link = wf["links"].find{ _1["rel"] == "self" && _1["type"] == "application/activity+json" }
Now consider I have a visual programming version representing that expression, and I want to ask someone's opinion about it on Slack. Unless your visual programming environment has a solution for how I can post that to Slack, and have others respond with tweaked versions, it's a non-starter.

Once you've solved Slack - maybe with a plugin - you need to solve all our e-mail clients, and you need to solve Google Docs and Word for when we write documentation, and a multitude of other tools.

You might be able to get part of the way there with a browser plugin, but you'll still have a wide variety of other tools and platforms to cover.

Part of the solution is probably to rethink the whole UX context. One interesting example to think about is ProtoFlux, a part of Resonite:

https://youtu.be/70PH5cQQEdQ?si=y4YmhnimzferVpCD

Since you develop inside VR you can also talk about and show the code in the same VR environment.

It's probably at this level we need to rethink stuff to make visual programming practical.

But then you want to write about it and explain it to someone else, and you're back to needing a way to represent it that fits on a page or can be explained verbally.
ProtoFlux is largely explained through tutorial videos I think. And I'm not saying that is superior. But I can imagine a lot of quality of life features around explaining it through voice that could be added to make it potentially superior or at least competitive.

You could for example explain every node verbally and visually to an AI bot that can then explain it to the next person, or selectively retrieve parts of your explanations on demand. (OK, I realize that sounds unnecessarily complex.)

Requiring me to watch videos to follow along is an absolute non-starter to me. It's way too slow. If it was just learning an environment maybe I could tolerate that, but the showstopper is communicating about code for projects. An AI bot doesn't solve this - if you can't relay the information textually, there's no reason the AI bot will be able to.
Better to round-trip to the AST, and have the textual representation be derivative of that (e.g. by a code formatter like Go's).

This also makes it easier to verify that you have all the same capabilities in both representations, as the ways of manipulating the AST are enumerable.
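
As a rough sketch of that approach (assuming Ruby 3.0+ for pattern matching): the AST is the source of truth, the text is produced by a single canonical formatter, and each edit is a function over the AST. The node shapes and the `format_ast` and `rename_var` helpers are hypothetical, made up for this example.

```ruby
# Hypothetical mini-AST made of tagged arrays:
#   [:assign, name, expr], [:call, receiver, method, args],
#   [:var, name], [:str, value]

# Canonical formatter: a given AST always yields exactly one text form.
def format_ast(ast)
  case ast
  in [:assign, name, expr]
    "#{name} = #{format_ast(expr)}"
  in [:call, recv, meth, args]
    "#{format_ast(recv)}.#{meth}(#{args.map { |a| format_ast(a) }.join(', ')})"
  in [:var, name]
    name.to_s
  in [:str, value]
    value.inspect
  end
end

# One of the enumerable edit operations: rename a variable wherever it is read.
def rename_var(ast, from, to)
  case ast
  in [:var, ^from] then [:var, to]
  in Array then ast.map { |child| rename_var(child, from, to) }
  else ast
  end
end

AST = [:assign, :link, [:call, [:var, :wf], :fetch, [[:str, "links"]]]]

puts format_ast(AST)                         # link = wf.fetch("links")
puts format_ast(rename_var(AST, :wf, :doc))  # link = doc.fetch("links")
```

A visual editor would call the same edit functions, so both views can only produce states the AST can represent.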

Many people have tried that.

I built a language and UI around that way back, and many others have. I ditched mine because there were way too many unsolved problems I felt made it useless.

The problem is that if your primary means of working with the code is visual, the textual representation of your code then tends to be foreign to you when you're trying to use it to communicate aspects of the code, and when you constrain yourself to something that can be represented in a readable manner in a textual form, it turns out to be really hard to get to a point where the visual form is easier to work with.

E.g. something as basic as how you comment code in ways that round-trip nicely is an unsolved problem.

If I have code represented as a graph, I'd be inclined to want to label relationships and dataflows that would be hard to place textually in a way that is meaningful in a textual version and that would roundtrip back to labels in the right place in the visual version.

I've not seen any attempts at visual code that get even that right.

I've not managed to get it right myself either. If you force users to use an editor built into this tool, and edit a textual representation where some information is hidden, you can do better, but then if people e.g. copy a textual representation of the code into another application and back in, you end up with a mess.

Again, I want to be proven wrong about this. Badly. I love the idea. I've just seen enough failed attempts (and made enough failed attempts) to be disillusioned about it.

Why would you want to work with the text representation, except when debugging or in the backend? I mean I get why you'd want the text representation to exist--we have mountains of infrastructure around text-based representations of code. Git for version control and LLM code models would work out of the box, for example. But that can all be handled on the backend by transpiling the AST to text as needed. Why would the user need to interact with the textual representation?

Commenting needs to be solved at the language level, and there are many languages that have solved this exact problem. Python, newLISP, and Smalltalk IIRC all have methods for docstring commenting APIs such that the docstring is available as text to the running program / REPL. Use similar syntax to allow any statement to have comments attached, and use this instead of free-form /* */ comments.

> Why would you want to work with the text representation, except when debugging or in the backend?

How would I communicate about the project to others in e-mails, instant messenger, face to face, in blog posts, in articles, in books?

How would I review diffs of code changes effectively?

That is why.

Find me a representation I can talk about and write about efficiently without screenshots or videos or requiring special software of every recipient on every platform, and you'll have advanced the state of the art in this field immensely.

> that have solved this exact problem

None of the ones you described have solved the problem of mapping between a visual and textual representation of the program seamlessly. Just attaching the comments from a textual version to an AST of the textual version is trivial. That's not the challenge.

> Use similar syntax to allow any statement to have comments attached, and use this instead of free-form /* */ comments.

That doesn't get close to solving the issue. When I have a diagram showing the data flow of a piece of code, and I attach a comment to the edge between the two nodes, in the textual representation where does that comment go? Does it go in the text version of the source node? In the destination node? What if I write a comment in the textual version right before a method call, and then switch to the visual version, does that stay in the source node? Does it become a label of the edge representing the method call? There are tons of edge cases there.

The problem isn't finding a way to attach the comments in the right place, but finding a way that roundtrips perfectly without adding noise in either representation.
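
A toy illustration of the edge-comment case, as a sketch under stated assumptions: the `Graph` and `Edge` structs, the `to_text` and `from_text` helpers, and the "put the comment above the destination statement" rule are all hypothetical choices, not taken from any real tool.

```ruby
# Hypothetical dataflow graph: two nodes and one commented edge.
Graph = Struct.new(:nodes, :edges)
Edge  = Struct.new(:from, :to, :comment)

GRAPH = Graph.new(
  { a: "links = fetch(url)", b: "link = pick(links)" },
  [Edge.new(:a, :b, "may be empty")]
)

# Text export: the only natural slot for an edge comment is a line comment
# next to one of the two statements. This sketch picks the destination.
def to_text(graph)
  graph.nodes.map do |id, code|
    notes = graph.edges.select { |e| e.to == id }.map { |e| "# #{e.comment}" }
    (notes + [code]).join("\n")
  end.join("\n")
end

# Text import: a line comment can only be read back as belonging to the
# statement that follows it, so the edge label comes back as a node comment.
def from_text(text)
  pending = []
  text.lines(chomp: true).each_with_object({}) do |line, node_comments|
    if line.start_with?("# ")
      pending << line.delete_prefix("# ")
    else
      node_comments[line] = pending
      pending = []
    end
  end
end

text = to_text(GRAPH)
puts text
p from_text(text)
```

After the round trip the comment is attached to the second statement and nothing records that it described the a-to-b edge. Recovering that would need some extra marker in the text, which is exactly the noise in question.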

I think diffs, snippets, and source control are crucial, but I think they only tie you to a textual representation if you are trying to use existing, text-oriented tools for them.

I grant that this makes the project significantly larger if you're going to remove that constraint.

I'm pretty sure I have a general solution to that problem. The key is that hierarchies can be embedded in text, as HTML and XML do. Check out https://github.com/bablr-lang/
I'm not sure how that is solving it? It needs to be compact, and readable.
I personally think CSTML is both compact and readable, though I understand why it would look the opposite the very first time you see it (and without any syntax highlighting at that).

Open up the source code for this web page: is it compact and readable? The answer seems to be that HTML is "good enough" in that regard, and I suspect CSTML will be the same especially as more developer tooling for it becomes available.

The problem is that I'm not talking about markup. Markup - unless it's so lightweight as to be near unnoticeable - is not a solution. Unless every application you're passing it through supports CSTML, the representation needs to be as compact and readable as a regular programming language.

To take your example, pretty much anything longer than [1, true, "3"] is a non-starter if someone is pasting it into Slack, or sending an e-mail. The CSTML representation isn't readable to them, and would take additional steps on both sides vs. just writing the source representation. I'm not going to tell people how to do something by writing it into some other tool and pasting some large blob into Slack or my e-mail client and expect the recipient to reverse the process.

That is the problem space. How you represent that as an AST isn't the problem - that's easy. How you represent it in a way that everyone can read and write and that "passes seamlessly" via existing tools is the problem.

(I must also admit that I think the choice of serialization format for CSTML is utterly baffling and feels like it adds a lot of NIH)

We're using markup to communicate right now and the overhead is small enough to be unnoticeable.

Try copy and pasting some HTML-formatted text from this page into your paste buffer.

I assure you that what goes into the buffer is HTML. You can verify this because if you paste the text into an HTML-embedding WYSIWYG editor (such as an HTML email composer) the formatting will be preserved. But if you paste the content of the same HTML buffer into VSCode, notice that you don't get the raw HTML but rather the textual content that was embedded in that HTML.

Now try pasting it into your terminal, or any of a large number of other tools that don't support HTML.

Now try doing the same with CSTML, in applications that do support HTML.

Now consider how little that markup contributed to the semantics of the text here - most of it can be stripped and the text retains its meaning.

Then consider how long it took for HTML to percolate through these applications despite HTML - unlike CSTML - having universal utility.

And here's the thing: As someone with a history of writing compilers, parsers, and language tools over 30+ years, CSTML is too verbose for me to want to use even for tooling. It's way too low level even as an internal representation for tooling.

It also still doesn't help: You still will need a compact textual representation anyway so people can represent it in contexts where the tooling doesn't exist, or can't exist, such as paper and handwriting, and speech.

All I can do is encourage you to try. If you succeed, great, and if not you will understand the difficulties involved.

I've tried the custom syntax representation (though I used XML, which saved me from writing a custom parser) - it turned out to just be an obnoxious detour. I tried syntax aimed at removing ambiguity in round-trips, and it sort of worked but got too verbose. I tried a purely visual approach, which is why I'm so insistent you need to be able to round-trip to text. I spent years trying things and looking at others' attempts.

I'd love to be wrong, but I very much don't expect any big breakthroughs in this area in decades - the attempts I keep seeing keep repeating all the same mistakes with few signs of lessons learned.