|
I don't believe that I can change your mind on this, so I didn't intend to respond, but as this is the top comment, I do want to provide a rebuttal on why we think this is actually a programming language, why the code we have written is actually a compiler, and why Marsha is a useful exploration of the programming language design space.
First, a programming language is just a syntax to describe functionality that could be turned into an actual program. Lisp[1] was defined in 1958 but didn't have a full compiler until 1962. Was it not a programming language in the intervening 4 years? Marsha does not fall into this category, since it can already generate working code, but the bar for what counts as a programming language is, I believe, lower than most would immediately think.
Second, a programming language does not need to be imperative to be a programming language; otherwise languages like Lean[2], where you write proofs and the compiler figures out how to generate code that fulfills them, would not be programming languages. Lean, Coq, and other such languages are much more technically impressive than Marsha, true, but they share the property that you describe the properties a function should have and the compiler then generates the program that fulfills those properties. Marsha differs from these proof-based languages in that poor specificity still produces some sort of program instead of a compilation error, which makes it sort of like JavaScript, which will do something with the code you write as long as it is syntactically valid. This is not a desirable property of Marsha, but it is a trade-off that in practice makes it more immediately usable to a larger number of people than Lean or Coq, because the skill level required is lower.
This is also, as you allude to, the current state of the world in most software development: project managers come up with high-level requirements for new features, technical leads on engineering teams convert these into tasks and requirements for individual contributors, who then write the code and tests, which are peer reviewed by the team as a sanity check and then committed. This process may or may not cover all situations, and the specifications at all levels are likely not as rigorous as what Lean would require of you. Marsha mimics this process, starting from the tech lead level and bleeding into the individual contributor level. The type and function descriptions are analogous to the tech lead requirements, and the examples are analogous to the test suite the individual contributor would write. Just like in real-world development, if these are not well specified, the resulting code will likely have logic bugs that would need to be addressed with a stricter definition and improved test cases.
The compiler consumes this definition into an AST[3], walks the tree to generate intermediate forms, and generates output in a format that can be executed by a computer. Some use "transpiler" for a compiler that targets another language, but in my opinion that is a subset of compilers, not a separate kind of tool; otherwise the Java compiler would be a "transpiler" to JVM bytecode, a format that is also not directly executable by a computer.
We are still in the very early stages with Marsha and agree that more syntax could be helpful. We already have 4 different syntactic components in Marsha, versus the fully open-ended text entry of GitHub Copilot or ChatGPT. But what makes Marsha interesting (to me) is that it makes it possible to explore a totally new dimension in programming language design: the degree of formalization of the syntax used to define a program itself.
In many papers on new algorithms, the logic is described in a human-readable list of steps without the hard specificity of programming languages, which helps the reader understand the core of the algorithm rather than getting bogged down in the implementation details of this or that programming language. There is still a formalism, but it differs from that of traditional programming languages, and Marsha lets you work with your computer in a similar way.
Are there cases where this is a bad idea? Absolutely, just as there are cases where writing your code in Python is a bad idea versus writing it in Rust. There is no perfect programming language useful for all scenarios, and there probably never will be. But there will be a subset of situations where the trade-offs Marsha provides make sense. By being more forgiving than even the most forgiving interpreted languages out there, Marsha is in a good position to fill that niche if the primary barrier is difficulty.
[1]: https://en.wikipedia.org/wiki/Lisp_(programming_language)#Hi...
[2]: https://en.wikipedia.org/wiki/Lean_(proof_assistant)
[3]: https://github.com/alantech/marsha/blob/main/marsha/parse.py... |
A programmer is a human who connects the world of humans with the world of machines. To do this, he is required to sufficiently understand both worlds. On the human side this requires social competence and professional accountability, which machines don’t have. On the computing side, it requires at least that machines behave in predictable and comprehensible ways. Marsha appears to fall short on both counts.
Using an LLM for programming is inherently irresponsible. The people arguing in favor of doing so have not subjected LLMs to any kind of rigorous testing. They simply have unshakeable faith.
I am in the midst of a careful review and surgical takedown of a 9,000-word demonstration of ChatGPT’s supposed ability to help testers test. It took maybe 20 minutes for some drooling consultant fan-boy to produce the demo. It has so far taken about 30 hours of work to carefully pore over each sentence and show how it is wrong. I am doing the testing and critical thinking that the original consultant failed to do.
The Marsha site has a brief line about how it produces “tested” Python code. The one thing you can bank on with LLMs is that none of you big-eyed enthusiasts has a serious attitude about testing. It’s all simplistic demonstration.
I’m frustrated by this culture of fawning adoration of unproven and unprovable tools. I hope this trend peaks and becomes a generally acknowledged joke soon! Then we can resume with craftsmanship and responsible engineering.