by eduction 1019 days ago
We stop calling the useful ones “DSLs” so this is a truism.

Is SQL a “waste of time?” Regular expressions, HTML, Makefiles, CSS (and CSS selectors aka jquery selectors)?

It’s the bad ones that are a waste of time. The ones still called “DSL” instead of just “language,” “format,” or “syntax.”

8 comments

> Is SQL a “waste of time?”

It certainly wastes a lot of my time because it is considered just a DSL and not a "real" programming language, and therefore doesn't get the same kind of attention towards making it better that "real" programming languages get. Maybe there is something to the original thesis.

Like, why can't I import shared query modules to compose into my queries? I can't imagine any other programming language without that feature. The execution engines support it in concept, allowing you to get there with some painful copy/pasting or using code generation (which seems to be how most people use SQL these days[1]). But because SQL is seen as just a DSL it doesn't get any attention towards making the language better.

[1] Which I'm not sure SQL is all that good at being the assembly language of databases either. It is strangely irregular.
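
A minimal sketch of the workaround described above, where host code stitches shared query fragments into a query because SQL itself has no import mechanism. It uses Python's standard `sqlite3` module; the `SHARED_QUERIES` registry, the `with_shared` helper, and the `users` table are all hypothetical names made up for illustration.

```python
import sqlite3

# Hypothetical "shared module": named query fragments kept in host code,
# since SQL has no way to import them.
SHARED_QUERIES = {
    "active_users": "SELECT id, name FROM users WHERE active = 1",
}


def with_shared(names, body):
    """Prepend the named fragments as CTEs so `body` can refer to them by name."""
    ctes = ", ".join(f"{name} AS ({SHARED_QUERIES[name]})" for name in names)
    return f"WITH {ctes} {body}"


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, active INTEGER);
        INSERT INTO users VALUES (1, 'ann', 1), (2, 'bob', 0), (3, 'cy', 1);
    """)
    sql = with_shared(["active_users"], "SELECT name FROM active_users ORDER BY name")
    return [row[0] for row in conn.execute(sql)]
```

This is string-level code generation, so it only works inside a host program; an analyst at a plain SQL prompt gets nothing from it, which is the complaint being made.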

I feel like the things you don't like about SQL are explicit design choices that make it well suited for many tasks, but perhaps not whatever you are trying to do.

It's like, "Geez, regular expressions sure do suck. They don't come with a package manager, concurrency, or a way to make system calls." I only feel comfortable embedding regular expressions or SQL in code because they are inherently restricted "DSLs."

If SQL was a general purpose programming language, then we wouldn't need SQL, we would just use existing general purpose programming languages.

Assuming you have some use case for query composition, why is the feature better handled within SQL as opposed to in the general purpose query language managing the DB queries and making the calls?

> why is the feature better handled within SQL as opposed to in the general purpose query language managing the DB queries and making the calls?

If the environment hosting and managing the queries (e.g. psql) is open source, maybe you can hack in support. Often that is not the case. Even when you are able, then your work becomes non-standard and thus not easily shared with other analysts, which kind of defeats the whole sharing of modules.

If you are writing a full-fledged application, using "real" programming languages, then a lot of people build their own query languages on top of SQL, treating SQL as just an "assembly language" that their query language compiles to, to add what SQL lacks. That's all well and good, but totally overboard when you just want to answer some questions about your data.

I do think you make a point that SQL tries to be too many things to too many people. No question, it is useful being able to embed SQL strings into other code without needing to worry about complex dependencies... Except you still have that worry as we already tried adding that composition through stored procedures, views, etc. which are dependent on a particular database state being true. It's there, just not very well thought out from a developer ergonomic perspective. Indeed, SQL is confused.

I think your examples are spurious and wrong, when compared with the specific kind of DSL the article talks about. SQL is a "Fourth Generation Language", much like Prolog, which is quite different to what is being discussed. HTML isn't a DSL; it's a type of SGML, a specific document markup language - you can't just shout "DSL" and squint a bunch and assume "All sort-of-languages are DSLs". Regular expressions, sure, but that's not relevant here because it's so universal - and mathematically backed (ever done a CS degree?) - that it's embedded inside every language now - i.e. this representation is a somewhat fundamental and mathematically general representation of interacting with strings - the DSLs in question here are, quite obviously, nothing of the sort.

The article is right. DSLs are, universally, awful, as they grow and grow until they encompass all features of a general programming language - but because of their evolution path they're awkward and twisted and lack basic features of a general programming language (and definitely lack all the tooling you get). Witness what happened with XML as a configuration language. What started off as another leaf node under SGML ended up - by way of the Java ecosystem - growing if statements and loops and all the rest, only in a horrendous, hard-to-read and harder-to-debug monstrous language.

So he's right. We should use real languages. And we should probably all just use Pulumi.

I think the problem here is more that people still need to work on their positions to figure out what they really mean, because while you have some points, regular expressions are the canonical example of a DSL. Their domain is strings, and they make certain things you want to do in them very easy to express.

> The problem here is that while you can say “Terraform is more than a DSL” - if you ever peel back the layers or try to have a conversation with someone who’s a true dyed in the wool Terraform zealot and start to try and understand where they’re coming from, you begin to realise they don’t love Terraform, they love this weird half language because they’re an expert in it, and they’ve built their entire career on being an expert on something you can only use for one specific use case.

That's from the article, and that really sounds a lot like talking to someone that knows and loves regular expressions (and I count myself as one of those people). The alternative, which is often done in lower level languages, is either some simple nested conditionals or a state machine, and plenty of people choose to just write state machines. I mean, that's all a regular expression is anyway, a shorthand expression for building state machines to process strings.

It should be no surprise that there are plenty of people that think regular expressions are wasteful and you should just write your own state machine (fewer than there were in the past, but they exist). Regular expressions are exactly like everything else that's being discussed in the article, but many people like them. Perhaps that's a point we should focus on to determine what makes a DSL successful and good compared to the alternative.
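
To make the "shorthand for a state machine" point concrete, here is a rough sketch in Python that recognises the same strings two ways: once with the regular expression `ab*c` and once with a hand-written transition table. The pattern and both function names are invented for illustration. Note that Python's `re` module is a backtracking engine rather than a literal DFA, so this only shows that the two accept the same strings, not how the engine works inside.

```python
import re

PATTERN = re.compile(r"ab*c")


def matches_regex(text):
    """Accept strings of the form: one 'a', any number of 'b', one 'c'."""
    return PATTERN.fullmatch(text) is not None


def matches_state_machine(text):
    """The same language, spelled out as an explicit state machine."""
    # States: 0 = start, 1 = seen 'a' (looping on 'b'), 2 = seen final 'c' (accepting).
    transitions = {(0, "a"): 1, (1, "b"): 1, (1, "c"): 2}
    state = 0
    for ch in text:
        state = transitions.get((state, ch))
        if state is None:  # no transition defined: reject
            return False
    return state == 2
```

The regex is five characters; the state machine is a dozen lines. Whether that trade is worth a second notation is the whole disagreement.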

If you re-read my reply, you'll see that I said that regex is a DSL, but doesn't apply for different reasons.
I did read your reply, and I think your reasons for excluding it are insufficient. You can't just exclude any DSL that succeeds (has become "universal") and say "see? no DSLs are good!" The point of me noting how people resisted (and still resist) regular expression usage is to note that just like any other DSL there are those that dislike it and eschew its use and instead just write the code. It is exactly what the article is talking about, where it advocates writing in the base language and not using a DSL.

If you're going to make a serious argument that "DSLs are, universally, awful" then you're going to need to account for regular expressions a bit more carefully. That can be you admitting that you just dislike them and don't use them so consider them awful as well, but if you don't hate them then you may want to focus on the why, and it is likely a much more interesting conversation topic to pursue than "DSLs bad".

> Regular expressions ... this representation is a somewhat fundamental and mathematically general representation of interacting with strings - the DSLs in question here are, quite obviously, nothing of the sort.

https://en.wikipedia.org/wiki/Regular_language

Regular expressions have a deep mathematical background. An ad-hoc DSL from a company that specifies infrastructure does not.

Regular expressions, as used in practice in programming are mathematically general for interacting with strings, but they are in no way fundamental. They are a DSL for generating NFAs and DFAs (depending on the underlying engine). There is nothing fundamental about them, they are just a shorthand (a DSL) for those abstractions. Indeed, the very article you posted notes that anything able to be implemented as a finite automaton is also a regular language due to Kleene's theorem.

> Regular expressions have a deep mathematical background. An ad-hoc DSL from a company that specifies infrastructure does not.

You very specifically went far beyond making any statement about ad-hoc DSLs from companies that make infrastructure. You said "DSLs are, universally, awful".

I'm also interested why you left out makefiles in your original rebuttal. I think makefiles are a good example of a simple and useful DSL for accomplishing what it sets out to do. That's not to say it can't turn into a mess (what language, domain specific or not, is free from that concern?), nor that it solves all needs for all people, but makefiles are simple, straightforward, solve a useful problem, are well known by many people (and easy to explain and teach), and I don't think most people would consider them a mathematical concept core to computer science. Are they awful? If so, why?

The real issue here is a DSL without intention of enforcing limitations permanently.

A Regex, SQL, TLA, CSS and so on are all better DSLs than allowing a general purpose language because they are meant to restrict inputs of lower priority than other goals, where priorities are largely fixed.

A corporate DSL is meant to expand to whatever grandiose plans the company has for accepting impractical input from platinum customers.

Aren’t regular expressions the abstraction to state machines? They all get converted to a DFA or NFA, no?
Yes, and I said as much. All DSLs are abstractions to some sort of code that does some action. What I'm saying is that we have examples of bad DSLs, and good DSLs, and perhaps by focusing between them we can come up with some useful information about what distinguishes one from the other.

To me, that's a much more interesting (and useful!) discussion than just piling onto the "DSLs are awful" bandwagon, and I also think it's a flaw in any argument put forth in that vein that needs to be addressed before I can accept it.

Exactly. The languages in the article are nothing of the sort.
> DSLs are, universally, awful, as they grow and grow until they encompass all features of a general programming language

This is the Slippery Slope Fallacy writ plain.

A great many DSLs are stable over years or decades. By your reasoning, a stable DSL is bad because it's going to grow. Why is it going to grow? crickets

Peeling away the fallacy, we could perhaps agree that "the uncontrolled growth of a DSL is an evil best avoided." Or if I'm being less generous, we could look at the bottom of that slippery slope and conclude "general programming languages are bad," because that's what the author seems most incensed by.

General programming languages were meant from the start to be something like what they are today. They're intended to manage complexity, not pretend nobody will ever want to do complex things.

They are also not application specific. Language design is hard. DSLs are made up by people whose main project isn't the language.

Sometimes the motive for a DSL is something like "I don't like how this takes 5 lines of code in JS, if I made my own language it would only take 3"

The random "ever done a CS degree?" is weird, and the premise that regular expressions are fundamental and mathematically general and therefore not a DSL is flawed: EBNF is also like that, and even more general. It _can_ parse HTML. Doesn't stop it from being an arbitrary representation of abstract syntax patterns. It's a DSL through and through and so is regex. Actually pretty hard to find that pure of a DSL, with such a defined domain, as many DSLs evolve to be more general-language-like over time as the article says.
Well, I remember them teaching that stuff, and regex is quite clearly a very different sort of DSL than the ones being discussed. That combined with the comment author's seeming lack of nuance made me wonder if they'd ever looked deeper than "all things without a for loop are a DSL". I think we are getting too far into the semantics here though, at risk of splitting hairs and getting lost in the details. The core point is that the article is talking about some quite specific types of DSLs, so you can't respond with "No, DSLs are useful" and then list off a bunch of unrelated stuff as a rebuttal.
I was replying to a very un-nuanced article, with the headline “DSLs are a waste of time” and no further hedging on that headline in the text. So ya, I knocked down the argument on the argument’s own (very broad) terms.

No need for the personal swipe.

> The core point is that the article is talking about some quite specific types of DSLs

It used specific types of DSLs as evidence but I don’t see a narrowing of the claim itself? Where are you getting that?

> No need for the personal swipe

Fair enough

> It used specific types of DSLs as evidence but I don’t see a narrowing of the claim itself? Where are you getting that?

What makes me sad, as I posted in a top-level comment, is that I read this exact same rationale 20 years ago, by Steve Yegge, and yet we are still having the same stupid debate in programming about whether we should use DSLs and then the DSL grows and grows and, lo and behold, it's Turing-complete, but its ergonomics are dire and its tooling nonexistent. And it's not like this is some niche part of the ecosystem - it's how most people deploy infrastructure.

What I despise is seeing the same problems come up again and again, and never seeing good solutions to them, just the same mistakes, repeated. Programming is, truly, terrible. Just a load of slaves building the Pyramids.

I don’t think it makes any sense to dismiss regex “because it’s so universal” — you’re disqualifying a DSL because it’s successful, then saying all DSLs are unsuccessful. It’s just a tautology at that point, same as the article.

(And SQL is not a dsl because it’s a … programming language? But also HTML is not a domain specific language precisely because it’s not a programming language? I don’t follow your logic at all. I don’t think either are programming languages, but both are languages in another sense of the word, as evidenced by their full names, and certainly dsls.)

I would be comfortable calling HTML and SQL both DSLs. Given the scope of their use.
Seems like the bad DSLs happen when one attempts to express a subset of general purpose imperative computation in an application specific way.
Untyped lambda calculus FTW!
It is. But nobody listens. Ever.
JavaScript was a cute little DSL once too!

I feel like the author missed a more obvious comparison between Puppetlang and HCL: vendor specific languages.

When you consider them from that perspective it’s clear that the decision is based more on which company you trust to serve your long term needs over any point in time implementation details. Is that company Puppet? HashiCorp? Pulumi? The implementation details obviously matter but if you’re investing 7+ years into an ecosystem like the author did, then there are a lot more factors than just syntax.

Yikes! JavaScript is probably not the best example of a “good” DSL. It’s a crap language that benefited largely from being the only game in town re browser access.
I mean how you define “good” is an endless discussion by itself. For the purposes of this discussion I think considering JavaScript a successful and useful DSL is sufficient. I share your opinion that it is by no means “good” by more mechanistic measures. :)
I consider JS a pretty great language these days, my only complaint is the lack of a batteries included standard library that does the stuff underscore et al do.
This is a good point but also a little circular since the good vendor specific languages tend to break free — like Netscape’s (and nominally Sun’s) JavaScript :-)

(and I think SQL started at IBM but I’m not sure if they tried to keep it proprietary or make it a standard)

Sometimes! Google still controls Go. Microsoft has retained control of dotnet. Java more successfully escaped Sun/Oracle, but that ecosystem is still governed by a very small number of mostly very large companies.

Regardless the author getting 7+ years of runway out of a specific technology is hardly a waste! My average technology switching time is probably closer to 5-6 years.

True! I think I used Perl about that long, and probably Ruby after that. And those are (obviously) general purpose.
JavaScript wasn't really a DSL though was it? Not in the sense of this article. Brendan Eich was hired to embed Scheme into Netscape Navigator. That's not the same as the kind of crippled templating languages this article is discussing.
One is procedural (JavaScript), the other is declarative (HCL).

Both are (or at least were originally) “crippled” compared to “real” languages.

Both grew immensely each eventually escaping their single-vendor origins.

I don't think that this is a true comparison. JavaScript was limited mostly by its sandboxing, not what you could write in it. You can't really compare that to what HCL is and does.
It's like (speaking) languages really. The good ones are ones that have always been there and people have just sort of accepted into the mainstream (like English). Don't invent your own language. If you can do it in an existing language, don't invent your own.

SQL and CSS selectors fit the bill. Everyone has just kind of accepted them now. It's not an excuse to try creating new languages for every new thing, it's an uphill task.

> If you can do it in an existing language, don't invent your own

then we should all be using c, lisp, smalltalk and perl (add your favourite old language here). because most younger languages don't bring anything new to the table.

> The good ones are ones that have always been there

esperanto? not exactly popular, but it is definitely better than english and solves problems that english can't. (it's just that people don't realize the problem yet)

I guess I should qualify it by saying "Domain Specific Language" instead of "language". Newer programming languages are different because they have a huge body of extremely clever people working on it making sure it's up to standard. DSLs do not reach the same amount of rigor, they're just there. Some parsers loosely thrown together. Where's the documentation? The specification? The language servers, the editor support? None. Don't invent your own DSL (other than for yourself).
i am just picking your examples apart :-)

i think we both want the same outcome, that is well designed and useful languages, whether they are DSLs or not, and i agree that most DSLs probably aren't that. but not all general languages are getting that kind of rigor from the start either. some get it rather late after they became popular enough to have demand for fixing the problems that stem from their initial ad-hoc design. i am looking at javascript and php here in particular, but there are probably others.

DSLs, due to their nature, are less likely to ever reach the level of popularity where it's worth it to fix design problems. general languages on the other hand are more likely designed by people who care and without the pressure of a quick solution. (and it's my understanding that javascript was a quick solution, which would kind of prove the point)

esperanto, btw, was designed with rigor.

My take is that all embedded DSLs are bad. By embedding your DSL in another language you might gain certain things for free, but you also sacrifice a lot of control and often extra boilerplate code is necessary.

The exception of course is Lisp where embedded languages can feel like standalone languages.

Stated in another way: standalone are sometimes bad, but embedded DSLs are almost always bad.
From what I’ve heard of and used myself LINQ works pretty well as an embedded DSL because it’s more like a language extension with some solid semantics rather than a classic embedded DSL that is not terribly far from transpilers
According to the article, the alternative to DSL is the programming language. So what the author understands as a DSL is some special-purpose language that is self-contained with its own run-time such that you have to string multiple of these to solve the task. Possibly, the author might have

The article doesn't discuss the possibility of building DSLs with the programming language and using them, all integrated together.

Yes, CSS is crap for instance. The web should have one language for everything in which styling is a DSL.

Regular expressions are better when they are in your language:

  2> (regex-compile '(0+ (or "a" "b")))
  #/(a|b)*/
E.g. we can cleanly stick something into a regex's syntax tree without worrying about escaping:

  3> (regex-compile '(0+ (or "*" "b")))
  #/(\*|b)*/
  4> (let ((x "[")) (regex-compile ^(0+ (or ,x "b"))))
  #/(\[|b)*/
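
For contrast, a rough sketch of the same thing in a language where regexes are built from strings, using Python's standard `re` module. Here the escaping has to be done by hand with `re.escape` before the pieces are joined; the `zero_or_more_of` helper is a made-up name for illustration, not part of any library.

```python
import re


def zero_or_more_of(*literals):
    """Build a pattern matching zero or more repetitions of any of the literals."""
    # Each literal must be escaped explicitly, or "[" and "*" would be
    # read as regex syntax instead of as plain characters.
    alternatives = "|".join(re.escape(lit) for lit in literals)
    return re.compile(f"(?:{alternatives})*")


bracket_or_b = zero_or_more_of("[", "b")
star_or_b = zero_or_more_of("*", "b")
```

Forgetting the `re.escape` call is easy to do with string-built patterns, and it is the class of mistake the syntax-tree form above rules out.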
SQL would be my #1 example too.

It is probably the ultimate DSL if you are willing to get deviant with your tech stack.

it's kind of a catch-22, right? SQL is declarative. And while people seem to love declarative DSLs, they are generally really only awesome on the happy path. For SQL, sometimes, it can be incredibly obtuse when something goes horribly wrong (in terms of query speed) -- even with great tools like EXPLAIN. And a query that is fast on one SQL might not be so fast on another SQL.

For other declarative DSLs, I've definitely bashed my head against the wall and gotten a few gray hairs over "why the f* doesn't this goddamn thing that should work work". Sometimes it's because I'm not grokking how the platform works. Sometimes it's a bug, or unimplemented feature.

I suspect the reason why people don't absolutely abhor SQL is that the biggest players (MySQL, Postgres, Sqlite) have actually done a reasonably good job of making the 98% paths very good and 90% of the rest "good enough".

> the biggest players (MySQL, Postgres, Sqlite) have actually done a reasonably good job of making the 98% paths very good and 90% of the rest "good enough".

Exactly - Unless you are doing something really unusual (or wrong), the vendors of these engines have almost certainly encountered and optimized for some shape approximating your scenario.

If you want to get into some extreme ends of the practice, "ancient" engines like DB2 are some of the most capable. Many have been around longer than most developers today have been alive (myself included). That is a lot of optimization legacy to argue against. The Halloween Problem and the iceberg meme come quickly to mind. Why wouldn't you want to stick with something that has already dealt with all of that bullshit?

Eh, if you're at the point where you're doing stuff like "give me all records where a one-to-many has exactly two entries, and a different one-to-many has at least one record with property x" it gets VERY hard to write correct, performant SQL. I don't think that sort of query is too bizarre.
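
One possible shape for that kind of query, sketched with Python's standard `sqlite3` module. The `customers`, `phones`, and `orders` tables and the `status = 'x'` condition are hypothetical stand-ins for "a one-to-many", "a different one-to-many", and "property x". This version aims only at being correct; whether it is fast depends on the engine and the indexes available, which is the hard part.

```python
import sqlite3

# Customers with exactly two phones AND at least one order with status 'x'.
QUERY = """
SELECT c.id
FROM customers AS c
WHERE (SELECT COUNT(*) FROM phones AS p WHERE p.customer_id = c.id) = 2
  AND EXISTS (SELECT 1 FROM orders AS o
              WHERE o.customer_id = c.id AND o.status = 'x')
ORDER BY c.id
"""


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY);
        CREATE TABLE phones (customer_id INTEGER, number TEXT);
        CREATE TABLE orders (customer_id INTEGER, status TEXT);
        INSERT INTO customers VALUES (1), (2), (3);
        INSERT INTO phones VALUES (1, 'a'), (1, 'b'), (2, 'c'), (3, 'd'), (3, 'e');
        INSERT INTO orders VALUES (1, 'x'), (1, 'y'), (2, 'x'), (3, 'y');
    """)
    return [row[0] for row in conn.execute(QUERY)]
```

In the sample data only customer 1 qualifies: customer 2 has a single phone, and customer 3 has two phones but no order with status 'x'.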
Sure. But you can create a declarative DSL without having to copy SQL.
> We stop calling the useful ones “DSLs” so this is a truism.

Bingo!

Lua was born as a DSL for configuration files. Today it is way more than that.

JSON was born as a DSL for describing data, a less verbose alternative to XML. It is more than that.

JSON never got new features. We just made new things around it.

Which is why it's amazing for data interchange and obnoxious to directly interact with by hand.

It should be a rule that all mature configuration languages become Turing complete