by chc 4440 days ago
I don't think the problem stops at syntax. It's possibly an even bigger issue that mixing different language semantics can be awkward. As a big obvious example, a language where all objects are nullable will interface awkwardly with one that only has option types. Similarly, interfacing with something like Smalltalk (which uses methods for flow control) or Forth (which…is Forth) would be awkward from a language that's more like C++.

Even in an environment like the JVM which specifies a lot of stuff for you, it's awkward to call into Clojure from Java because of the semantic differences.

1 comments

I wasn't implying there is no problem with the semantics, just that it's much easier to deal with when you already have the parsed trees, because they're easier to reason about with code - and we can project them unambiguously.

We already do write tools for such language interoperability for specific pairs of languages, which is often really awkward because it requires us to re-implement the parsers, and only deals with entire code files rather than specific productions in the syntax.

Composing languages is pointless unless it makes sense semantically, and that has to be worked out on a per-language basis (or per production rule). That is what I was hinting at with using Haskell as the glue for such interoperability: if we encode the semantics into the type system, so that one syntax expects a language box of type T in its grammar, then one should be able to use any other language whose parser returns a T, and the semantics will be well-defined for it.
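To make the "language box of type T" idea a bit more concrete, here is a minimal sketch. Everything in it is made up for illustration: `Expr`, `parseInfix`, `parsePrefix` and `hostDouble` are hypothetical names, and the two "parsers" only recognise a single toy form each. The point is only that the host production asks for an `Expr Int`, so any front end whose parser returns an `Expr Int` can fill the hole.

```haskell
{-# LANGUAGE GADTs #-}

import Text.Read (readMaybe)

-- Hypothetical typed syntax tree: the index t records what a fragment means.
data Expr t where
  IntLit  :: Int -> Expr Int
  BoolLit :: Bool -> Expr Bool
  Add     :: Expr Int -> Expr Int -> Expr Int
  If      :: Expr Bool -> Expr t -> Expr t -> Expr t

-- One shared semantics for every surface syntax.
eval :: Expr t -> t
eval (IntLit n)   = n
eval (BoolLit b)  = b
eval (Add a b)    = eval a + eval b
eval (If c th el) = if eval c then eval th else eval el

-- Helper: read an integer literal token.
lit :: String -> Maybe (Expr Int)
lit = fmap IntLit . readMaybe

-- Toy front end 1: infix syntax such as "1 + 2".
parseInfix :: String -> Maybe (Expr Int)
parseInfix s = case words s of
  [a, "+", b] -> Add <$> lit a <*> lit b
  [a]         -> lit a
  _           -> Nothing

-- Toy front end 2: prefix syntax with space-separated tokens, "( + 1 2 )".
parsePrefix :: String -> Maybe (Expr Int)
parsePrefix s = case words s of
  ["(", "+", a, b, ")"] -> Add <$> lit a <*> lit b
  [a]                   -> lit a
  _                     -> Nothing

-- Host production with an Int-typed hole (the "language box").
-- It never learns which syntax produced the fragment.
hostDouble :: Expr Int -> Expr Int
hostDouble box = Add box box
```

With this loaded in GHCi, `fmap (eval . hostDouble) (parseInfix "1 + 2")` and `fmap (eval . hostDouble) (parsePrefix "( + 1 2 )")` both give `Just 6`, while `hostDouble (BoolLit True)` is rejected by the type checker before anything runs.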

It could also provide the glue for converting between nullable types and option types, for example, by requiring that a language returning a "Nullable T" be wrapped in some function "ToOption", which converts "Nullable T" into "Option T". Attempting to use the Nullable where an Option is expected would fail to parse. How ToOption is implemented is left to the author of the code.
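A rough sketch of that conversion, under the assumption that the two conventions are modelled as distinct types. `Nullable`, `Option`, `toOption`, `orDefault` and `guestLookup` are all hypothetical names, not library types, and the `toOption` shown is just one possible implementation an author might pick.

```haskell
-- Hypothetical model of a value coming from a "everything is nullable" language.
data Nullable a = Null | NotNull a
  deriving (Eq, Show)

-- Hypothetical model of the option type the host language expects.
data Option a = None | Some a
  deriving (Eq, Show)

-- One possible ToOption; the author of the glue decides what it really does.
toOption :: Nullable a -> Option a
toOption Null        = None
toOption (NotNull x) = Some x

-- Host production that only accepts an Option.
orDefault :: a -> Option a -> a
orDefault d None     = d
orDefault _ (Some x) = x

-- Stand-in for a fragment written in the nullable guest language.
guestLookup :: String -> Nullable Int
guestLookup "answer" = NotNull 42
guestLookup _        = Null

-- Rejected by the type checker, because Nullable Int is not Option Int:
--   orDefault 0 (guestLookup "answer")

-- Accepted once the guest fragment is wrapped explicitly.
wrapped :: Int
wrapped = orDefault 0 (toOption (guestLookup "answer"))
```

Here the rejection happens at type-checking time rather than at parse time, but the effect is the same: the unwrapped Nullable can never end up where an Option is expected.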

It's much easier to have interoperability between individual production rules in different languages (which share many parts in common) than between the "whole text files" we currently have, which basically require the languages to be almost equivalent in order to convert between them.

Also, as a result of storing the semantic information as opposed to sequential text, it would be possible for the user to choose his preferred syntax for any semantic elements in the tree, since they're just working on a pretty-printed version. Most of the concerns about "code style" disappear because style is detached from the actual meaning that is stored.
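As a small illustration of the pretty-printing point, here is a sketch in which one stored tree is rendered in two surface syntaxes. The `Arith` type and both renderers are hypothetical and deliberately tiny; a real projection would also need to handle layout, precedence and comments.

```haskell
-- Hypothetical stored form: only the meaning, no formatting.
data Arith
  = Num Int
  | Plus Arith Arith
  | Times Arith Arith

-- Projection 1: fully parenthesised infix notation.
renderInfix :: Arith -> String
renderInfix (Num n)     = show n
renderInfix (Plus a b)  = "(" ++ renderInfix a ++ " + " ++ renderInfix b ++ ")"
renderInfix (Times a b) = "(" ++ renderInfix a ++ " * " ++ renderInfix b ++ ")"

-- Projection 2: Lisp-style prefix notation.
renderPrefix :: Arith -> String
renderPrefix (Num n)     = show n
renderPrefix (Plus a b)  = "(+ " ++ renderPrefix a ++ " " ++ renderPrefix b ++ ")"
renderPrefix (Times a b) = "(* " ++ renderPrefix a ++ " " ++ renderPrefix b ++ ")"

-- The single tree both users are editing.
example :: Arith
example = Plus (Num 1) (Times (Num 2) (Num 3))
```

`renderInfix example` gives `"(1 + (2 * 3))"` and `renderPrefix example` gives `"(+ 1 (* 2 3))"`. Neither string is stored anywhere, so neither user's preference affects the other.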