by wahern 603 days ago
When I think of a DSL, I think of a language with specialized syntax, grammar, or constructs suited to the problem domain. Think SQL, AWK, or regular expressions. This is just a LISP variant with a typical host-side API for registering function names.

I'll never get how merely having function names that reflect the use case, plus a stripped down or absent standard library, qualifies as a DSL. I know some people have long used "DSL" in this way, especially among LISP fans, but... I just don't get it. If I want a DSL it's because I want something that gives me, e.g., novel control flow constructs a la AWK, or highly specialized semantics a la regular expressions, that directly suit the problem domain. If I'm not getting that kind of power, why tie myself to some esoteric dependency? Either way you're adopting a tremendous maintenance burden; it better be worth your while.

I'm a huge fan of Lua and have used it for many projects in different roles, but never once thought of any particular case as having created a DSL, even when stripping the environment to just a few, well-named, self-describing functions.

I don't mean to criticize this particular project. Good code is good code. It's just the particular conceptualization (one shared by many others, to be fair) of what a "DSL" means that bugs me.

11 comments

> I know some people have long used "DSL" in this way, especially among LISP fans

Generally this would be called an "embedded domain-specific language". Some languages make it relatively easy to change the syntax. For example, Common Lisp has reader macros to change the token syntax and macros to change the Lisp syntax. With those, one can create all kinds of embedded languages, including domain-specific languages (languages specific to a particular domain). Examples would be embedded logic languages, query languages, rule-based languages, languages to describe user interfaces, etc. The Common Lisp standard has a notorious example of that, the complex LOOP construct, which uses a very different syntax: https://www.lispworks.com/documentation/HyperSpec/Body/m_loo...

There are other real-world examples out there, for example an embedded domain-specific language for describing 3D objects in parametric CAD, used for technical things like turbines or other parts of an aircraft.

I would agree with your sentiment.

This is basically a lib with some extra syntax to parse CSV files.

A 'proper' DSL would require a very specific domain to which it is applied, like document creation, or solving a certain problem and only that, but not much else. Turing completeness is usually not required either.

For example, MATLAB and LaTeX are domain specific, as is SQL. They are used to do math, create documents, and mangle tables, respectively.

IMHO, just renaming forEach to Map to parse CSV files does not fit the bill, so the linked example is not that great.

This project is basically a DSL builder thingy with a text processor demo.

As an aside:

I am missing the most important thing when it comes to CSV which is configuration of the input.

That might be because this is more of an example, but it is usually a sign of a, let's say, more academic project.

Working with CSV is usually a source of a lot of "good fun" where many hours can be spent.

That's because your average CSV is often an SSV or TSV in some ISO encoding that is anything but UTF-8. It usually contains line breaks that some combination of tool and operating system has turned into funny icons. There are also weird escape characters, used in ways that are not consistent from row to row. Sometimes you have titles, sometimes not. Dates make no sense, are language dependent, and come in a weird format the intern made up 10 years ago. Even numbers are weird, like 12e^-25 or '0.00' or '10.000,0'. Then you get columns which really should be two or more, or there are lines which span multiple rows.

IMHO, robust CSV parsing is much better served by a really low-level approach where you rigorously check everything (and return to sender everything that does not fit).
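To make that concrete, here is a minimal sketch in Python. It assumes a semicolon-separated file that has already been decoded with an explicitly chosen encoding; the function name `read_strict`, the two-column schema, and the sample data are all made up for illustration. Nothing is guessed or repaired: a record either fits or it lands in the reject list with a reason.

```python
import csv
import io
from decimal import Decimal, InvalidOperation

EXPECTED_HEADER = ["id", "amount"]  # hypothetical schema for this sketch


def read_strict(text, delimiter=";"):
    """Parse delimited text; return (rows, rejects).

    rows is a list of (int, Decimal) tuples; rejects is a list of
    (line_number, reason) pairs for every record that does not fit.
    """
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter, strict=True)
    header = next(reader, None)
    if header != EXPECTED_HEADER:
        return [], [(1, f"unexpected header: {header!r}")]

    rows, rejects = [], []
    for record in reader:
        line = reader.line_num
        if len(record) != len(EXPECTED_HEADER):
            rejects.append((line, f"expected {len(EXPECTED_HEADER)} fields, got {len(record)}"))
            continue
        raw_id, raw_amount = record
        if not (raw_id.isascii() and raw_id.isdigit()):
            rejects.append((line, f"id is not a plain integer: {raw_id!r}"))
            continue
        try:
            amount = Decimal(raw_amount)
        except InvalidOperation:
            rejects.append((line, f"amount is not a decimal number: {raw_amount!r}"))
            continue
        rows.append((int(raw_id), amount))
    return rows, rejects


sample = "id;amount\n1;0.00\n2;10.000,0\n3\n4;12e-25\n"
rows, rejects = read_strict(sample)
```

With this sample, the '10.000,0' record and the one-field record are rejected, while '0.00' and '12e-25' are accepted. A real pipeline would also need rules for dates, quoting, and encoding detection, which this sketch leaves out.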

> I am missing the most important thing when it comes to CSV which is configuration of the input.

1. Given that their stated use-case for this tool is generation of data for testing purposes, why would input configuration be relevant?

2. It doesn't have any CSV-associated features to begin with. If for some reason you wanted to use this for that, you would interact with your own bound Go functions to facilitate working with CSV, just as demonstrated in the article. Within those functions, you can set up any configuration your heart desires.
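For readers who have not seen the pattern, here is a rough sketch of what "bound host functions" means, written in Python for brevity. This is not lingo's API (lingo is Go, and its real interface is not shown here); the `Interpreter` class, `register` method, and the `add` and `repeat` primitives are invented for illustration. The tiny language only knows how to call whatever the host has registered, so any configuration lives inside the host functions.

```python
def tokenize(source):
    """Split an s-expression string into parens and atoms."""
    return source.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Parse one expression from the front of the token list (consumes tokens)."""
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens[0] != ")":
            items.append(parse(tokens))
        tokens.pop(0)  # drop the closing ")"
        return items
    try:
        return int(token)
    except ValueError:
        return token  # anything else is kept as a bare symbol


class Interpreter:
    def __init__(self):
        self._functions = {}

    def register(self, name, fn):
        """Host-side binding: expose a Python callable under `name`."""
        self._functions[name] = fn

    def eval(self, node):
        if isinstance(node, list):
            name, *args = node
            return self._functions[name](*(self.eval(a) for a in args))
        return node  # numbers and bare symbols evaluate to themselves

    def run(self, source):
        return self.eval(parse(tokenize(source)))


interp = Interpreter()
interp.register("add", lambda a, b: a + b)
interp.register("repeat", lambda s, n: s * n)

result = interp.run("(repeat ab (add 1 2))")  # "ababab"
```

There is no error handling for unbalanced parentheses or unknown function names; a real extension language would need both.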

DSLs don't need to be supported by a separate lexer/parser. Some (for better or worse) use standard formats like YAML or JSON. Some (i.e. embedded DSLs) are represented using terms defined in a programming language.

Any naming of types, functions or variables that you do while solving a problem in software is creating a "language" of terms that are specific to the domain.

A well-constructed fluent API can read a lot like what you would call a DSL. A configuration language in YAML is both YAML and a DSL for config.
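As a small illustration of the fluent-API point, here is a toy query builder in Python. The `Query` class and its `where`, `order_by`, and `select` methods are hypothetical names made up for this sketch, not any real library. The chained call at the bottom reads close to a query language even though it is ordinary method calls.

```python
class Query:
    """Toy fluent query builder over a list of dicts (hypothetical API)."""

    def __init__(self, rows):
        self._rows = list(rows)

    def where(self, predicate):
        # Keep only the rows for which the predicate is true.
        return Query(r for r in self._rows if predicate(r))

    def order_by(self, key):
        # Return a new query with rows sorted by the given field.
        return Query(sorted(self._rows, key=lambda r: r[key]))

    def select(self, *fields):
        # Terminal step: project each row down to the requested fields.
        return [{f: r[f] for f in fields} for r in self._rows]


people = [
    {"name": "Linus", "age": 21},
    {"name": "Grace", "age": 45},
    {"name": "Ada", "age": 36},
]

result = (
    Query(people)
    .where(lambda p: p["age"] > 30)
    .order_by("name")
    .select("name")
)
# result == [{"name": "Ada"}, {"name": "Grace"}]
```

Each step returns a new `Query`, which is what makes the chaining possible; whether that counts as a DSL is exactly the question being argued here.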

You might be interested in this classic pdf: http://www.cs.virginia.edu/~evans/cs655/readings/steele.pdf (Growing a language)

Without those function names that you talk about, you wouldn't really recognize a language. Those standard library function names give the "batteries included" kind of feeling. The size of the community libraries furthers that accessibility and productivity feeling of the language. Furthermore, you can certainly create control flows via libraries--with callbacks specifically--and create any kind of novel branching structure your domain had need of.
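A quick sketch of control flow built from callbacks, loosely imitating AWK's pattern/action rules in Python. The `Rules` class, its `on` decorator, and the log lines are invented for this example; it only shows that the branching structure can live in a library instead of in the grammar.

```python
import re


class Rules:
    """Hypothetical AWK-style pattern/action table built from plain callbacks."""

    def __init__(self):
        self._rules = []

    def on(self, pattern):
        """Decorator: run the function for every line matching `pattern`."""
        compiled = re.compile(pattern)

        def register(action):
            self._rules.append((compiled, action))
            return action

        return register

    def run(self, lines):
        # For each line, fire every rule whose pattern matches, in order.
        for number, line in enumerate(lines, start=1):
            for compiled, action in self._rules:
                match = compiled.search(line)
                if match:
                    action(number, line, match)


rules = Rules()
stats = {"error_lines": [], "total_ms": 0}


@rules.on(r"^ERROR\b")
def collect_error(number, line, match):
    stats["error_lines"].append(number)


@rules.on(r"took (\d+)ms")
def add_duration(number, line, match):
    stats["total_ms"] += int(match.group(1))


rules.run([
    "INFO request took 12ms",
    "ERROR request took 30ms",
    "INFO idle",
])
# stats == {"error_lines": [2], "total_ms": 42}
```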

A good API is a sort of DSL if it reflects how you think about the domain and helps you express instructions within that domain. The language can be very different. We experience that in our own language when we hear people talking with heavy use of jargon we don't recognize: they might as well be speaking another language.

But overall I agree with you that I would probably save the use of the jargon term DSL for novel syntax targeted at a specific domain.

> When I think of a DSL, I think of a language with specialized syntax, grammar, or constructs suited to the problem domain.

I think that's too strict. For example, JAX is (imo) an eDSL, but it doesn't have a specialized syntax, grammar or constructs - on the contrary, it's meant to feel just like Numpy. The thing that makes it special is the interpretation of those constructs.
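A toy illustration of "same surface syntax, different interpretation", in plain Python. This is not how JAX is implemented; the `Sym` class and `trace` helper are invented for this sketch. The same function `f` runs on ordinary numbers, or, when given symbolic arguments, records an expression tree instead of computing a value.

```python
class Sym:
    """Symbolic value that records operations instead of performing them."""

    def __init__(self, expr):
        self.expr = expr

    def __add__(self, other):
        return Sym(("add", self.expr, _expr(other)))

    def __radd__(self, other):
        return Sym(("add", _expr(other), self.expr))

    def __mul__(self, other):
        return Sym(("mul", self.expr, _expr(other)))

    def __rmul__(self, other):
        return Sym(("mul", _expr(other), self.expr))


def _expr(value):
    """Unwrap a Sym to its expression; leave plain values alone."""
    return value.expr if isinstance(value, Sym) else value


def trace(fn, *names):
    """Call fn with symbolic arguments and return the recorded expression."""
    return fn(*(Sym(n) for n in names)).expr


def f(x, y):
    return x * y + 2


value = f(3, 4)            # ordinary interpretation: 14
tree = trace(f, "x", "y")  # ("add", ("mul", "x", "y"), 2)
```

The body of `f` has no special syntax at all; only the objects flowing through it change what the operators mean.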

JAX isn't an eDSL, because it doesn't obey full Python semantics (namely, changing what a method does at runtime).

Mojo just discovered it ain't happening: https://news.ycombinator.com/item?id=41927000

I don't think it has to "obey Python semantics" at all to be an eDSL.

I don't think there's a need for such strict requirements. There's no need to invent a whole new paradigm; you can go with the usual ones as long as it works well. More often than not, the domain is not complex or flexible enough to require a language; it may only require a few algorithms (libraries), i.e. even if you do invent a language, few programs will be built with it. You may as well tweak an existing language for a nicer DX.

This is a very good and interesting point, but what if the point itself is to reduce the power and increase things like legibility?

If I create this kind of mapped-function DSL, I can ensure that things will be done a certain way, versus the borderline infinite possibilities of code.

Lisp macros can make their own control flow: it's like the relation of LINQ to F#.

Well, could you give some good examples of how Lua is used in your projects in different roles? I'm curious.

lingo is a _framework_ for building your own DSL for your go application.

and by DSL they mean "extension language". like vimscript or emacs lisp. or guile or python...

so you can easily add primitives to your application specific language, in go, specifically.

https://gitlab.com/gitlab-org/vulnerability-research/foss/li...

I think this fits neatly with the common understanding of "DSL", doesn't it?

https://en.m.wikipedia.org/wiki/Domain-specific_language

Yeah. I mean, isn't CSV itself a DSL? It can't execute, but it's a domain-specific markup language for structured data.