by CyberDildonics 528 days ago
Everyone understands the concept. Understanding why you shouldn't do it is what takes experience.

If you look at the source code for Doom, it is very straightforward. No fancy stuff, no cleverness, no pageantry of someone else's idea of what "good programming" is, just what needs to happen to make the program.

I'll even give you an example of an exception. Most for loops in C and its successors are more complicated than they need to be. Many loops run from 0 to a final index and need a variable to keep track of which index they are on. Instead of a verbose for loop, you can make a macro that always loops from 0 and always gives you an index variable, so you just give it the length and the symbol to use. Then you have something simplified that is actually useful. It's shorter, it's clear, and it will prevent bugs and be easier to read when you need nested loops through arrays with multiple dimensions.
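
A minimal sketch of the kind of macro described above, in C. The names `FOR_N` and `sum_matrix` are made up for illustration; the comment does not name its macro or show a definition, so treat this as one possible reading.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro: loop `i` from 0 to n-1. `n` is re-evaluated on every
   iteration, just as it would be in the hand-written for loop. */
#define FOR_N(i, n) for (size_t i = 0; i < (size_t)(n); ++i)

/* Sum a rows x cols matrix stored row-major in a flat array. */
long sum_matrix(const int *m, size_t rows, size_t cols)
{
    assert(m != NULL);
    long total = 0;
    FOR_N(r, rows)
        FOR_N(c, cols)
            total += m[r * cols + c];
    return total;
}
```

The nested loop is two short lines instead of two full for headers. The trade-off is that the index type (`size_t` here) is hidden inside the macro.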

I already gave examples before where clever extra syntax creates an exceptional situation but gains nothing.

The fundamental point here is that these opportunities are rare. Thinking that making up new syntax is a goal of programming is doing a disservice to everyone who has to deal with it in the future.


You are 100% right about all of that, except that you are talking about syntax extensions (the for-loop example) and not DSLs. A DSL does not need a new syntax; it is a collection of abstractions that lets you express problems in the language of the problem domain. It is not an API, because it is not an interface to some functionality. It is not about exposing functionality, but about adding semantic value for the upper layers. It may (not necessarily, but may) be formed by a collection of functions, and in that sense it is similar to an API. Sometimes it may indeed include extensions to a language, but in that case by the standard means of abstraction preferred in that language: classes, templates, functions, structures. The key is to reduce the cognitive load for the end programmers, who could be experts in the problem at hand but not in the underlying host language.

There is also the possibility of embedding in a non-programming language, like XML (e.g. the launch language in ROS) or S-expressions in the Oracle listener config file. You can also go ad hoc, as in the .msg files of ROS. But it is always about semantics, not syntax. Syntax is only the medium.

> It is not about exposing functionality, but about adding semantic value for the upper layers.

> Sometimes it may indeed include extensions to a language, but in that case by the standard means of abstraction preferred in that language: classes, templates, functions, structures.

You keep saying that there are no problems and that it isn't like anything mentioned but you don't have any examples.

What is an example of "adding semantic value" that isn't using the language's normal constructs but is still not something someone needs to learn and memorize?

You said a DSL has to have its own syntax, or has to change the language, and that it implies more cognitive load. That is just not the case, as stated in sources like SICP and Wikipedia.

The whole idea of a DSL is exactly to avoid learning something new. Of course there will be some piece of information to be learned, but what are we comparing against? Is there a solution where somebody does not need to learn absolutely anything? Of course not! You have to learn something to be able to use it. The question is how to minimize the cognitive load.

You are right that an example would help. Here are a couple that I recently worked on:

1) We had a very complex ASIC with a complicated way of configuring it: there were RF parameters and also a program that runs in the ASIC, say “repeat 20 times {send, receive, analyze, phase-shift}”; of course the real thing is much more complicated. Now, the ASIC manufacturer gives you an API for doing everything, which involves setting registers, flags, internal state machines, etc. We have an expert who knows a lot about RF and the application, but is weak in programming. We did it in Lisp, but I will try to explain it as if it were C: we made a bunch of functions, many of them very API-like, setters and getters. But to program the sequence, we have functions that do flow control. In C it looks a little awkward; in Lisp it is much better. The example above would be: “repeat(20); send(); receive(); analyze(); phase_shift(); iterate();” The guy who writes that “code” does not care about the base language (he had never heard of Lisp before; he only knew basic Python). But he was already writing those programs in pseudocode for documentation, so the cognitive load for him is minimal. He has to remember to add “();” at the end of each instruction, and that loops are “repeat(n) … iterate”. That’s it! That was much less than if he had to learn the whole API of the ASIC; he is not a programmer, he is an RF engineer. You may say it is an API, but look, there was already an API. It makes no sense to do an API over an API. It was all about transforming the language of the API into the language of the problem at hand. The API tries to expose every detail of the hardware, in a language based on hardware and C; the DSL tries to hide details or translate things into the language of the problem. So the user of the DSL has to learn less.

2) There was an automated planner with lots of rules. Think of it as “1000 ifs, some nested”. Originally, without a DSL, everything was hardcoded in C++. Based on libconfig (think JSON with C syntax), we developed a little language to express the ifs. Note: no new syntax was invented; it is the underlying JSON/libconfig syntax, which is well known. We only made a big “foreach” over all elements in the config file, and each element went through a big “case” that dispatched the substructure to the handling function for its instruction. It took 1 day to implement. After that, the intelligence lived in separate files, it could be reloaded dynamically, and the people writing the intelligence did not need to be C experts.
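
To make example 2 a bit more concrete, here is a rough C sketch of the "foreach plus case" dispatcher. Everything specific in it is an assumption made for illustration: the field names, the instruction names `if_below` and `if_above`, the `battery` and `distance` keys, and the action strings. The rule array stands in for whatever the config parser returns; no real libconfig calls are shown. C cannot switch on strings, so the "case" is an if/else chain here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One rule as it might look after parsing the config file (hypothetical layout). */
struct rule {
    const char *instr;   /* which instruction this entry is */
    const char *key;     /* state variable to inspect */
    double      limit;   /* threshold used by the instruction */
    const char *action;  /* what the planner does when the rule fires */
};

struct state { double battery; double distance; };

static double lookup(const struct state *s, const char *key)
{
    if (strcmp(key, "battery") == 0)  return s->battery;
    if (strcmp(key, "distance") == 0) return s->distance;
    return 0.0;  /* unknown keys read as 0 in this sketch */
}

/* Handlers, one per instruction. */
static int if_below(const struct rule *r, const struct state *s)
{
    return lookup(s, r->key) < r->limit;
}

static int if_above(const struct rule *r, const struct state *s)
{
    return lookup(s, r->key) > r->limit;
}

/* The big "foreach" with the big "case": the first rule that fires wins. */
const char *plan(const struct rule *rules, size_t n, const struct state *s)
{
    assert(rules != NULL && s != NULL);
    for (size_t i = 0; i < n; i++) {
        const struct rule *r = &rules[i];
        int fired = 0;
        if (strcmp(r->instr, "if_below") == 0)
            fired = if_below(r, s);
        else if (strcmp(r->instr, "if_above") == 0)
            fired = if_above(r, s);
        /* unknown instructions are skipped */
        if (fired)
            return r->action;
    }
    return "idle";  /* no rule fired */
}
```

Adding a new instruction means one more handler and one more branch in the dispatcher; the rules themselves stay in the config file.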

> a DSL has to have its own syntax

If it's the same language it can't be a new language. You didn't link anything with your sources.

> The whole idea of a DSL is exactly to avoid learning something new

But you have to learn the DSL and you have to throw away all your tools. These are two big problems they introduce so the problem they solve better be big and tools/debugging needs to be part of making the DSL. This is why a small DSL is not a good idea.

> We had a very complex ASIC which had a complicated way of configuring it: there were RF parameters

This is another side of the story. Passing parameters is data. Inside a program this is a very bad idea because you can already pass around all the data you want, any way you want, through function calls and memory layouts.

Passing data from one program to another or one computer to another is different, but then that isn't a language, that's a data format like any other file. GCode is a list of 'commands', but fundamentally it is a data format. If you look at the .obj format, it is ASCII and needs to be parsed, but it is not thought of as a language.

> Think about it as “1000 ifs, some nested”, originally without DSL, all was hardcoded in C++. We developed based on libconfig (think JSON with C syntax) a little language to express the ifs

This sounds like a data format. If something isn't being executed directly, it's data. If it is being executed directly, don't make a new language, because it takes a decade and hundreds of people to get it to work well.

I would really like to have a face-to-face conversation, because I can see you have a genuine interest in the discussion, but it seems we are talking past each other.

> If it's the same language it can't be a new language. You didn't link anything with your sources.

A language is more than its syntax. For example, Common Lisp, Emacs Lisp, Racket and Scheme are different languages with the exact same syntax. Java and C have very similar syntax, but are 2 languages. Source: SICP, https://web.mit.edu/6.001/6.037/sicp.pdf or the videos on YouTube.

A DSL does not need to have a new syntax. Source: the Wikipedia article, under embedded DSL.

If your DSL follows existing syntax, you can use the tools. Note my example with JSON.
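
As a sketch of the embedded case, the “repeat(20); send(); receive(); analyze(); phase_shift(); iterate();” program from my ASIC example could be hosted in plain C roughly like this. The instruction set, the buffer sizes and the `executed` counters are assumptions for illustration; the counters stand in for the vendor API calls, which are not shown here. Each DSL call only records an instruction, and `run_program` interprets the recording.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical instruction set; the real ASIC API is not shown in this thread. */
enum op { OP_REPEAT, OP_ITERATE, OP_SEND, OP_RECEIVE,
          OP_ANALYZE, OP_PHASE_SHIFT, OP_COUNT };

struct instr { enum op op; int arg; };

#define MAX_INSTR 64
#define MAX_DEPTH 8

static struct instr program[MAX_INSTR];
static size_t program_len;
static int executed[OP_COUNT];  /* how often each operation ran */

static void emit(enum op op, int arg)
{
    assert(program_len < MAX_INSTR);
    program[program_len].op = op;
    program[program_len].arg = arg;
    program_len++;
}

/* The domain vocabulary: each call only records an instruction. */
void repeat(int n)     { emit(OP_REPEAT, n); }
void iterate(void)     { emit(OP_ITERATE, 0); }
void send(void)        { emit(OP_SEND, 0); }
void receive(void)     { emit(OP_RECEIVE, 0); }
void analyze(void)     { emit(OP_ANALYZE, 0); }
void phase_shift(void) { emit(OP_PHASE_SHIFT, 0); }

/* Interpret the recorded program, expanding repeat ... iterate loops. */
void run_program(void)
{
    size_t loop_start[MAX_DEPTH];
    int remaining[MAX_DEPTH];
    int depth = 0;

    for (size_t pc = 0; pc < program_len; pc++) {
        struct instr in = program[pc];
        switch (in.op) {
        case OP_REPEAT:
            assert(depth < MAX_DEPTH && in.arg > 0);
            loop_start[depth] = pc;
            remaining[depth] = in.arg;
            depth++;
            break;
        case OP_ITERATE:
            assert(depth > 0);
            if (--remaining[depth - 1] > 0)
                pc = loop_start[depth - 1];  /* jump back; pc++ steps past the repeat */
            else
                depth--;                     /* loop finished */
            break;
        default:
            executed[in.op]++;  /* the real code would call the vendor API here */
            break;
        }
    }
    assert(depth == 0);  /* every repeat needs a matching iterate */
}
```

Because the host is ordinary C, the usual compiler, debugger and editor keep working on the DSL program; the only new things to learn are the six words and the repeat/iterate pairing.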

> Passing parameters is data. (…) Passing data from one program to another or one computer to another is different, but then that isn't a language

Well, actually it is. And data and code cannot be told apart. I can only recommend going through the SICP lectures on YouTube. Your example with GCode is good: code is data, data is code. Also, regarding the example, consider that it is, as I said, a great simplification; there are lots of details and constraints that I cannot possibly enumerate here. Also note that one way of passing data between 2 computers is RPC, which is a language (procedures and functions are called remotely, executing code on the remote computer, which works with the data). That was actually the case in the example.

> This sounds like a data format. If something isn't being executed directly, it's data. If it is being executed directly, don't make a new language, because it takes a decade and hundreds of people to get it to work well.

A C program is also a data format. Everything is a data format. In the end, inside the compiler or interpreter, the program is an AST, ALWAYS! And an AST is just a data structure!
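
A toy illustration of that last point, as a C sketch rather than anything resembling a real compiler: the expression 2 + 3 * 4 as a tree of structs. The node layout and the function names are made up for this example. One function executes the tree, the other only inspects it as data.

```c
#include <assert.h>
#include <stddef.h>

enum kind { NUM, ADD, MUL };

/* Hypothetical AST node for integer expressions. */
struct node {
    enum kind kind;
    int value;               /* used when kind == NUM */
    const struct node *lhs;  /* used by ADD and MUL */
    const struct node *rhs;
};

/* Executing the tree: an interpreter is a function over this data. */
int eval(const struct node *n)
{
    assert(n != NULL);
    switch (n->kind) {
    case NUM: return n->value;
    case ADD: return eval(n->lhs) + eval(n->rhs);
    case MUL: return eval(n->lhs) * eval(n->rhs);
    }
    return 0;  /* not reached for well-formed trees */
}

/* Treating the same tree purely as data: count nodes without running anything. */
size_t count_nodes(const struct node *n)
{
    assert(n != NULL);
    if (n->kind == NUM)
        return 1;
    return 1 + count_nodes(n->lhs) + count_nodes(n->rhs);
}
```

Whether the tree counts as "code" or "data" depends only on which function you hand it to.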

> Common Lisp, Emacs Lisp, Racket and Scheme are different languages with the exact same syntax

Far from it. On the s-expression level there are already differences. On the actual language level, Common Lisp for example provides function definitions with named arguments, declarations, documentation strings, etc.

For example the syntax for function parameter definition in CL is:

    lambda-list::= (var* 
                    [&optional {var | (var [init-form [supplied-p-parameter]])}*] 
                    [&rest var] 
                    [&key {var |
                           ({var | (keyword-name var)} [init-form [supplied-p-parameter]])}*
                    [&allow-other-keys]] 
                    [&aux {var | (var [init-form])}*]) 

Above is a syntax definition in an EBNF variant used by Common Lisp to describe the syntax of valid forms in the language. There are different operator types, and built-in operators and macro operators in particular have a lot of syntax, some of it complex. See for example the extensive syntax of the LOOP operator in Common Lisp.

You keep saying there is some mythical "DSL" that isn't actually a new language, has no new syntax, works with whatever tools (no word on what language or what tools), is not an API, and "adds semantic value", but there are no examples after all these comments.

> Well actually it is.

This is stretching the term 'language' to mean whatever you want at the moment. There are things that execute and things that don't. These two should be kept as separate as possible, but this is a lesson that people usually need to learn for themselves after being burned many times by complexity that doesn't need to be there.

> And data and code cannot be told apart. I can only recommend going through the SICP lectures on YouTube

> A C program is also a data format.

You aren't the first person to be mesmerized by SICP, but if someone gets involved in thinking something is a silver bullet, they will tend to try to find information that validates this belief and reject info that doesn't. This pattern is found elsewhere in life too.

To understand some context, early in the life of LISP and Scheme, there weren't as many scripting languages and people mostly hadn't had a lot of experience with being able to eval tiny programs in their programs. These days that might be used to enable people to write small expressions in a GUI instead of a constant parameter. Many times in programming history people see something new and think it will solve all their problems.

Java went through the same thing. For a long time people thought deep inheritance hierarchies would save them, until gradually people realized how ridiculous and complicated it made things that could be simple. Inheritance from a base object let people use general data structures, and garbage collection + batteries included seemed great, but programmers conflated everything together and thought this terrible aspect of programming was a step forward.

Lisp was very influential at a time when people didn't have scripting languages, but it isn't a modern way to program.

Data formats are a separate issue and mixing in execution to those is a bad idea too, because the problem they solve is getting data into a program. When you put in execution you no longer know what you're looking at. Instead of being able to see or read directly the data you want, now you need to execute something to see what the values actually are. When you need to execute something you have all sorts of complexity including the need to debug and iterate just to see what was once directly visible.