by AlotOfReading 1302 days ago
People have been making this argument since the 80s and possibly even earlier. My experience is often the opposite. Little languages are usually far, far harder than (mis-)using "big" languages for small tasks.

The problem is that your DSL has to be understood by other people, including future you. Programming tasks are vast, combinatorially explosive state spaces full of weird potential interactions between features. Once you get above the complexity and universal familiarity of say, arithmetic, it's difficult for others to understand what's going on just by looking at 1-2 live examples. You have to heavily invest in proper docs and tooling (if your language doesn't provide it for free). By the time you've completed that your "little language" usually isn't such a little effort anymore.

If you don't, you've just made the next CMake. Congrats you monster.

18 comments

Indeed.

That's why we have languages with functions now, because people didn't want to manually do a register dance in assembly.

That's why we have namespaces, because naming conventions only take you so far.

That's why we have map and filter (or equivalent) because that's what most loops are doing anyway.

Generation after generation, we discover that we all use common abstractions. We name them design patterns, then we integrate them in the language, and now they are primitives.

And the languages grow, grow, bigger and bigger. But they are better.

More productive. Less error prone. And code bases become standardized, simple problems have known solutions, jumping to a new project doesn't mean relearning the entire paradigm of the system in place.

Small languages either become big, or are replaced by things that are big, for the same reason most people prefer a car to a horse to go shopping.

Not that horse riding will totally disappear, but it will stay in its optimal niche; it is not "the future".
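To make the map/filter point above concrete, here is a minimal sketch of the same computation written once as a hand-rolled loop and once with the primitives. The `numbers` list is made up for the example.

```python
# Hypothetical input, invented purely for illustration.
numbers = list(range(1, 11))

# Hand-written loop: selection and transformation mixed with bookkeeping.
even_squares_loop = []
for n in numbers:
    if n % 2 == 0:
        even_squares_loop.append(n * n)

# The same intent expressed with the filter and map primitives.
even_squares = list(map(lambda n: n * n, filter(lambda n: n % 2 == 0, numbers)))

print(even_squares)  # [4, 16, 36, 64, 100]
```

Both versions produce the same list; the second states only the "which items" and "what to do with them" parts.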

I think one of the biggest counter examples to your argument is SQL.

It's been around a long time. It's not general purpose. It's considered the best option there is if your setup allows you to use it.

SQL persists because it's an interchange format, not just a programming language. It's one of the few programming languages you'll see embedded in other languages - and generating SQL from other languages is a common source of security bugs. You can't upgrade away from SQL without changing both ends of that connection.
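The security point can be sketched with Python's standard `sqlite3` module. The table, the rows, and the attacker input below are hypothetical; the contrast is between building SQL text by concatenation and passing the value through a placeholder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", 1), ("bob", 0)],
)

# Attacker-controlled input (hypothetical).
user_input = "nobody' OR '1'='1"

# Unsafe: the host language generates SQL by string concatenation,
# so the input is parsed as part of the query.
unsafe_sql = "SELECT name FROM users WHERE name = '" + user_input + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safer: a placeholder keeps the input as data, never as SQL syntax.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (user_input,)
).fetchall()

print(len(unsafe_rows))  # 2: the injected OR clause matches every row
print(len(safe_rows))    # 0: nobody has that literal name
```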

You can write large programs in SQL, but it's generally considered good practice not to.

(I feel I ought to mention LINQ here, not to make any specific point but just to fanboy about it)

For LINQ, I personally vastly prefer the fluent version (IEnumerable.Select(..).Where(...) and so on) over the SQL-like syntax.
same, and in fact, despite using C# since before LINQ even existed, I don't even know how to write the sugar candy version of it.

Part of that is me coming from C++ and its algorithm header and the other part is that the code is just easier to read and understand than the sugar candy version (to me, at least).

So as not to repeat the same argument: https://news.ycombinator.com/item?id=33704199
It's still quite little compared to general purpose languages with similar expressive power in the same domain.
Plenty of people prefer language-integrated things like LINQ over SQL, although we can argue whether or not LINQ represents a kind of DSL.
SQL is a beast of a language and there's definitely room for improvement. Also, it's not Turing complete.
MySQL 8 is Turing complete with recursive common table expressions.
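For readers who have not seen one, here is a minimal sketch of a recursive common table expression. It runs through Python's `sqlite3` module rather than MySQL 8 (an assumption made only so the example is self-contained; SQLite also supports `WITH RECURSIVE`). It shows the iteration mechanism, not a proof of Turing completeness.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A recursive CTE: the second SELECT refers back to the CTE itself,
# producing one new row per step until the WHERE clause stops it.
rows = conn.execute(
    """
    WITH RECURSIVE fib(n, a, b) AS (
        SELECT 1, 0, 1
        UNION ALL
        SELECT n + 1, b, a + b FROM fib WHERE n < 10
    )
    SELECT a FROM fib ORDER BY n
    """
).fetchall()

fibonacci = [a for (a,) in rows]
print(fibonacci)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```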
Somewhat pedantic argument: SQL is kinda dead. Sure, modern databases use upgraded dialects of it, but they are custom to each database and often incompatible with plain SQL. There are even many cases where modern databases don't support even standardized SQL constructs.

The easiest example of where databases and SQL part ways: UPSERT. It doesn't exist in standard SQL.

ref. https://jakewheat.github.io/sql-overview/sql-2016-foundation...

Many modern products use SQL. Supabase and BigQuery come to mind, but the examples are basically innumerable.

There are others of course. But I'd rather call SQL challenged (by GraphQL etc.) than dead.

SQL has MERGE
Forest for the trees, my dude.
Then let me address the forest. Modern DBs offer SQL as an on-ramp for the most common DML and querying. This further entrenches it as a data language and is why SQL just won't die.

There are always more proprietary methods where one needs them. That doesn't mean SQL is dying, it means SQL will likely grow to include some of those too.

> Small languages either become big, or are replaced by things that are big, for the same reason most people prefer a car to a horse to go shopping.

So why are shell languages still around? Why are they not replaced by C#, C++, Java or another big (=general purpose) language?

I find your horse->car comparison more akin to the sh->bash->zsh transition. Zsh is not as small as sh, but still it is in the small league if you ask me.

Small does not mean w/o functions, without NS, without map/filter: it means "not general purpose".

Shell languages are a very good example, because they have been replaced mostly by bigger languages. First by Perl, then by Python.

Nowadays, most people don't write bash if it must be more than a few lines: it fits its niche perfectly, like horse riding.

But you are not going to do a website with bash anymore, server swarm deployment with powershell, or build your encoding pipeline in fish. These are tasks that we used to do using those small languages, until we found out that we prefer a car to do shopping.

> server swarm deployment with powershell

I do this routinely with PowerShell. PowerShell is not bash or fish; it's highly usable in all situations except those that require the greatest performance, and even then there are solutions for various types of problems.

Do you prefer a car or a touring bus when you go shopping? Because making an analogy between a car and horse doesn't really make any sense.

Why not have a niche language for making websites? Or for server swarm deployment? Or a video encoding pipeline?

>So why are shell languages still around?

Are they? I don't use them more than once a year

I'd say you're in the extreme minority then. I'm not a bash hacker but I usually end up writing a little script for myself at least once a month. Even just doing `command && command` is technically using a shell language
So you are using bash once in a while (once a month is not a lot for a programmer) for a niche use case (the specialty of bash: short scripts).

Hence, you are making the point "little languages are not the future". It has found its local optimum.

I said I write custom scripts about once a month. I use bash literally every day. But I also wasn't arguing the future of programming languages. The parent here said that shell languages aren't being used
Still alive and quite well. In fact, as more and more programs are written, they become more and more useful. We're already well into an era where programming can be nearly entirely ignored in favor of merely scripting new behavior out of the interactions between existing programs.
Shell languages have "must be easy to type in execution order" and "must integrate with random programs on the filesystem" as an overriding consideration. You're not going to get that with Java.
gawk compiled via webassembly to allow for running in browser allows 'modern' gui input / output beyond the command line interface while still retaining the ability to be just a cli program.
The irony of your horse analogy is that a horse is more general purpose than a car. A horse can travel along train tracks, roads, sidewalks, and hiking trails.
There are two things that the article mentions that your omni-language has massive problems with:

1.) Performance. For example a run-time for a user-friendly little language that maps HTTP requests to SQL queries can be much faster than a language that does the same thing by plugging user-friendly APIs together. A custom run-time can parse an HTTP request string directly into SQLite op code while the JS developer is writing glue code that takes orders of magnitude more memory and CPU time.

2.) Static analysis. This means tools that are better at finding bugs, finding optimizations, visualizing structure, etc.

Both of these things are theoretically true, but only if the little language has enough resources behind it to optimize and build enough tooling. A big language is much more likely to have those resources because the target market is big enough to justify it. A niche little language will likely never get that kind of mass behind it, so the tools will be lacking and the runtime won't be optimized.
Big general purpose languages don’t have the capability for certain kinds of performance improvements or static analysis so the market size isn’t a factor.
My point isn't that the big languages can do everything that a little language could theoretically do, it's that the little languages won't have the resources to pull them off, nor will they have the resources to even do what the big languages do.

Proper debugging, syntax highlighting, language servers, security audits. These are things that engineers in the real world expect a language to have, and each little language would have to reinvent each of them. In contrast, a library can piggyback off of the tooling provided in the host language.

So even if a language can deliver on the performance and static analysis that it promises (which few will), it cannot reach adoption because it cannot provide the infrastructure needed.

(That doesn't even get into the onboarding concerns that I and others have raised about having a codebase that is strung together from a dozen tiny languages.)

> Proper debugging, syntax highlighting, language servers, security audits. These are things that engineers in the real world expect a language to have, and each little language would have to reinvent each of them.

I don’t always see this as the case and this might be a very fruitful area for research. What I mean is, much like how we have tools like bison, antlr, and the k framework, I could easily see this notion extending to language servers, etc.

As for onboarding, well, who are we onboarding? New software engineers working on a general purpose language or new operations members working on a DSL?

I remember a time when CSS and HTML had yet to be consumed by a general purpose language and the onboarding for new web designers was significantly easier.

I don’t want to spend all day playing human debugger because your little language doesn’t support debugging.

I’m not here to tell you how very clever you are for reinventing the wheel. I have my own shot to get done and that’s easier when we make a smaller language out of our general purpose language by agreeing to style guidelines, instead of avoiding solving social problems with technology by creating your own game to change the rules.

When everyone in your group expects the rest of the team to commit 10% of our attention to their tool or module, that doesn’t scale. If we go over 50% in total, our capacity to solve problems becomes hamstrung.

If you work someplace where new devs are useless for a year, you’ve likely already got the snowflake disease.

What does any of what you’ve written have to do with the concerns raised by the author of the article?
There are certain kinds of analysis a non Turing complete language can have that a Turing complete language can never have.
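A minimal sketch of what such an analysis can look like, assuming a hypothetical loop-free rule language over a small finite domain (all names below are invented for illustration). Because rules are plain data with no loops or recursion, "can these two rules ever fire for the same user?" can be decided by exhaustive checking; for arbitrary programs in a Turing complete language, non-trivial semantic questions like this are undecidable in general (Rice's theorem).

```python
from itertools import product

# Hypothetical finite domain for a loop-free rule language.
DOMAIN = {
    "gender": ["male", "female"],
    "age_group": ["18-29", "30-49", "50+"],
}

# Rules are plain data: field -> set of accepted values.
rule_a = {"gender": {"male"}, "age_group": {"30-49"}}
rule_b = {"age_group": {"30-49", "50+"}}
rule_c = {"age_group": {"18-29"}}

def matches(rule, user):
    """A rule fires when every constrained field has an accepted value."""
    return all(user[field] in allowed for field, allowed in rule.items())

def can_overlap(r1, r2):
    """Decide, over all possible users, whether both rules can fire together."""
    fields = list(DOMAIN)
    for values in product(*(DOMAIN[f] for f in fields)):
        user = dict(zip(fields, values))
        if matches(r1, user) and matches(r2, user):
            return True
    return False

print(can_overlap(rule_a, rule_b))  # True: a male user aged 30-49 hits both
print(can_overlap(rule_a, rule_c))  # False: the age groups are disjoint
```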
I think I figured out the issue in this discussion. The article is actually about general purpose vs domain specific and not really about big and little… and unfortunately since most “DSLs” are APIs written in general purpose languages the author resorted to using the term “little language”.
In your story, the small language fits a local optimum, and becomes popular in the niche where it's good.

So small languages are not the future, contrary to what the article suggests.

They are either niche (bash), dead (Tcl), or get big (SQL).

1.) Bash is not niche.

2.) SQL is not a general purpose programming language and is used as an example of a little language by the author.

I can clearly imagine a future where instead of a few large general purpose languages we have a multitude of niche languages that have better performance characteristics, better tooling and smaller individual learning curves.

I won't repeat the other comments I made, but in summary: yes, bash is niche, stuck in a local optimum; no, SQL is not small anymore.
And even if you do invest in proper documentation and support, you still have to overcome the hurdle that people just _don’t want to spend time learning your one-off language_ - there’s nontrivial opportunity cost in learning something that won’t be useful anywhere else. So people will just do the bare minimum which will lead to misunderstanding and bugs.
> you still have to overcome the hurdle that people just _don’t want to spend time learning your one-off language_

That's an important point. Maybe as an academic someone is more inclined to learn new languages for the sake of intellectual interest, but on the engineering side, having uniformity of language is a big plus.

I ask myself, by the way, whether the author misses the point by not considering that all code the programmer writes is translated into machine language or bytecode made of elementary instructions: those instructions are the primitive language. But the programmer uses a higher-level language because he wants something more expressive.

Academics are not so hot on DSLs as one may think. Typically, they are viewed as padding material to the real research contribution. Anecdotally, I can recall papers being bashed because of the use of a DSL, but not applauded because of it.
> people just _don’t want to spend time learning your one-off language_

Which people though? If you make a DSL that non programmers in your organization use, I'm sure they will appreciate not having to learn the intricacies of Rust or whatever's in fashion this week.

We don't use a DSL per se, but a custom tool for writing QA tests, which looks like a kinda Visio block diagram software, only each block is a function or other logical entity. Anyway, after a few years struggling with it, for many different reasons, we are slowly and painfully migrating to writing tests in Python, and every single QA supports it.

Custom languages, with limited support, limited community, limited extensibility etc. are just like that - limited. And as soon as you hit a wall with them, the transition will cost more (in both time and money) than was saved in the first place by using "easier" tooling for non-programmers.

And you probably are going to hit a wall, because human desires expand. Your program does X? That was great, when you wrote it. But now, can you make it do Y? How about Z? Can you integrate it with system W? What do you mean, your little language doesn't support that?

While they are arguably "little languages", shells don't have this problem, because they allow you to invoke any program written in any language, which is an infinite-sized escape hatch for this issue. SQL kind of doesn't have this problem, because it has stored procedures (and also because people don't usually expect general computation from SQL). So SQL and shells are both "little" in some sense, but very much not little in others. Any other small language must also have some similar escape hatch, or it will trap you.

Digression: Reading the comments, SQL and shells keep coming up as the examples of "little languages". But SQL, for all its power, is not "the future". It's going to be part of the future, but it's sure not going to replace everything else. Neither are shells. And I don't see many other examples coming up. This doesn't sell me on the article's claim.

> a custom tool for writing QA tests ... we are slowly and painfully migrating to writing tests in Python, and every single QA supports it

QA is a programming related activity. These aren't the non programmers you are looking for.

I'm thinking more of shops that aren't pure software dev. Where you have specialists in <whatever the company does> that could use writing some automation themselves but don't have the time or inclination to learn all the modern meta-meta-programming stuff. 30 years ago they would have written some quickie BASIC for their formulas but now the software is based on Rust and C++ 2025 and they don't have time for that.

Basically programming is best handled by ... programming languages. However a domain that's not programming can be handled by a DSL.

> which looks like a kinda Visio block diagram software

But in this case there's your problem right there. That's not a DSL; it's a visual code generation tool. Can you think of even one tool like that that hasn't proved itself useless?

e.g. AWK or VisiCalc.
part of me wants to argue that the interface to a complex piece of software is a language, really, and if it were a self-consciously made language, it could be a lot better in a number of ways.

but, i think you're probably right.

CMake's perfectly fine once the Stockholm sets in.
It really isn't, but it is a bit like democracy, it could be better, but given the landscape and IDE integration capabilities, it is the best the C and C++ community alongside tool vendors have agreed upon.

I would certainly rather use CMake, even if I need an open book on the side, than Gradle, Blaze, autotools, yet another Python based build tool, ...

However, most of the time, since my use of C++ is related to personal projects, IDE project files are more than enough; they have been serving me well for the last 30 years.

> it is the best the C and C++ community alongside tool vendors have agreed upon.

Well yeah but only because it's the only build tool that the C++ community has vaguely agreed upon.

Meson and Bazel are much better, but also much less popular.

Certainly not better in their dependency on Python and JVM, or IDE integration across Qt Creator, KDevelop, Visual Studio, Android Studio, CLion, VSCode, C++ Builder.
CMake has working Xcode support, while Meson has an unmaintained proof-of-concept hackjob that doesn't work. I really like a lot of what Meson does, but as long as I have to choose between working IDE support and a good project definition language I'm always going to choose the first.
The fact that there simply isn't a widely accepted build system just means it's not a solved problem yet.
I would add to that: when people move to their next job, the DSL dies. I don't have a need for that DSL at the next company. I could try to implement it there, but IP rights would prevent that, and getting new people on board with my ideas is just so much work that it is not useful.

That is like learning the ins and outs of some SaaS application: you switch jobs and that specific experience is not useful to you at all.

A general purpose language, on the other hand, is useful even if you move from one country to another and take a job in a different business niche.

As a developer there is no upside for me to spend my time on diving into some DSL I won't use in my next job.

As a business person there is no upside for me to spend my time learning a DSL or a specific application interface inside and out that I won't use in my next job or in a different position.

I only have one experience to share. Back in the mid 90s, I was tasked with developing a webserver that provided targeted advertising. A requirement was providing the marketing team an accessible mechanism for defining rules. Basic stuff like encoding a marketing/ad-sale team rule such as "show ad of truck if user is male, at some age group". A little scripting language was developed; nothing fancier than conditional branching was involved on the surface. And the user base immediately got it and started using it, because it was a "little language".

Sometimes a DSL is really the right solution.
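As an illustration of how small such a rule language can be, here is a minimal sketch. The rule syntax, field names, and ad names are all invented for this example; nothing here is the original system, which the comment does not describe in detail.

```python
import re

# Hypothetical rule syntax, invented for this sketch:
#   show <ad> if <field> is <value> and <field> between <lo> <hi> ...
RULES = """
show truck if gender is male and age between 25 45
show minivan if age between 30 50
"""

CLAUSE = re.compile(r"(\w+) (?:is (\w+)|between (\d+) (\d+))$")

def parse_rule(line):
    """Turn one rule line into (ad, list of predicate functions)."""
    head, _, body = line.partition(" if ")
    ad = head.split()[1]
    predicates = []
    for clause in body.split(" and "):
        match = CLAUSE.match(clause.strip())
        if match is None:
            raise ValueError(f"cannot parse clause: {clause!r}")
        field, value, lo, hi = match.groups()
        if value is not None:
            # Equality test, e.g. "gender is male".
            predicates.append(lambda u, f=field, v=value: u.get(f) == v)
        else:
            # Inclusive range test, e.g. "age between 25 45".
            predicates.append(
                lambda u, f=field, a=int(lo), b=int(hi): a <= u.get(f, -1) <= b
            )
    return ad, predicates

def ads_for(user, rules_text=RULES):
    """Return every ad whose conditions all hold for the user."""
    matched = []
    for line in rules_text.strip().splitlines():
        ad, predicates = parse_rule(line)
        if all(p(user) for p in predicates):
            matched.append(ad)
    return matched

print(ads_for({"gender": "male", "age": 32}))    # ['truck', 'minivan']
print(ads_for({"gender": "female", "age": 60}))  # []
```

The whole "language" is conditional matching, which is roughly the level of complexity the comment describes.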

Well, go with a language that makes DSLs a piece of cake, like Ruby.
This could've probably just been Lua tho?
Most of what Lua offers will not be a requirement for what the ad ops team needs to create and maintain campaigns and there will be specific requirements that are not easily expressed in Lua.
modulo binding to (extant) c++ runtime, yep.

But this was in spring of '95. Ruby was released later, in December of that year. Lua was first publicly released in '94. I learned about the existence of these two a few years later (Lua first, and then Ruby via RoR).

mea culpa: there was no reddit or github or HN back in '95. Usenet would have been helpful but it wasn't on my radar in those days. I was just 2 years out of architecture grad school, and not exposed to the CS communities in my student years.

Yeah, back then there was much less to choose from. Tcl, I guess, but honestly I was never a fan.

I did write a firewall DSL in Perl right around that time, but that was just a teenager's toy.

I'd very much like to see someone coming up with nice syntax for hierarchical finite state machines and entity component systems, just like people came up with nice syntax for queries in the form of LINQ and nice syntax for HTML generation in the form of JSX.

Doing these things in the vanilla syntax of general purpose programming languages is not exactly great.

And then there's the issue of the language itself:

- can you even design it properly?

- is it tested?

- is it debuggable?

- how does it integrate with the rest of your program(s)? with the rest of your system(s)?

- what's the performance, and does it matter?

- is it documented?

- who is going to maintain it 1 year from now? 5 years from now?

Not to mention the human tendencies to align with certain ideals. For example, does it use 1-based indexing like Lua and Julia? If so, I just can't bring myself to use it.
I'm not extremely familiar with Julia, but it seems to me that it would let you switch to 0-based indexing as an option.
The notions first index, current index, previous index, next index and last index and each index are all invariant under shifts of the index set... yet this is the hill people choose to die on.
> Programming tasks are vast

But little languages could be a nice interface for the non- or semi-programming tasks. Do you really want your domain experts to fiddle with the core of your application or do you want your programmers to do that? A little language could be a great interface to encode specific business rules and domain logic.

The author gives SQL as an example of a little language and we do indeed already provide SQL interfaces to analysts and let them do their thing.

> Do you really want your domain experts to fiddle with the core of your application or do you want your programmers to do that?

The Curse Of Almost: Your tool is great, it's almost perfect... except for that one little thing it can't do, which your users need to do, which, therefore, leads to masses of ugly hacks unless you provide access to an escape hatch where sufficiently motivated experts can drop down to a real language which doesn't have your DSL's limitations and get the job done.

It's the Curse Of Almost because, if it were too much worse of a fit for the problem, nobody would even think of using it to solve that problem. Getting someone 90% there and crapping out puts your users in a more awkward position, especially if they feel they've invested effort in whatever tool they have.

An example is Talend versus CSV: Talend is an ETL Solution which Extracts data from some source, Transforms it according to a graphical DAG of ideally stateless components, and Loads it into some other storage. It's also a happy, friendly GUI on top of Java, which is nice, because the Real World isn't kind to happy, friendly GUI solutions which expect CSV is going to conform to any of your syntax rules or other misguided preconceptions about files having structure. So, when you have to run a Talend pipeline on vaguely-comma-delimited text files which may once have been machine-readable, you can make your own component which is literally just a block of Java code to parse the file using the Zerg Rush Of Ad-Hoc Rules Technique, an oft-overlooked method for designing parsers. You can also use that kind of thing to make components which are tasteless enough to demand state variables other than the stereotyped kind Talend itself provides.

It seems entirely possible to create a tool for making little languages that also supports interfaces for tooling and documentation. Tooling is actually quite abysmal for general purpose languages and as the article points out the tooling for little languages can be much more powerful when there’s a smaller surface area. Also we could build languages that are primarily geared around tooling and documentation instead of languages designed around different manners of defining functions and iterating over lists.

I also don’t think that anyone has ever suggested that making a custom language is a small endeavor.

Whatever the future of programming languages is, it will definitely not be popular at first, and negative criticism will be the top-rated commentary. And when the new paradigm comes I can almost guarantee that the majority of the HN crowd will be too old and set in its ways to make the transition. Why would the future be any different than the past with regard to paradigm shifts?

That's not the way to see the process. We have been highly successful at little languages already: they are, in essence, why when I write something like "a = a + 1" I can assume it works identically in C, JavaScript, and Python. (Semantically, it doesn't! But it is a portable intent.)

You might object and say, "but variable assignment and addition, that's a big language thing." It isn't, though; it's just an infix expression. And infix didn't pop out of nowhere; it had to be invented as part of the gradual creep upwards from machine level "coding" into a more abstract semantics. Infix parsers are small, and while the complete language is larger, what it's presenting is infix-compatible. "Regex" is the same way: there is a general definition of regular expressions, and then there are some common variants of regex, the implemented semantics.

The boundary between "the language needs its own compiler and runtime support" and "the language exists as an API call you pass a string into, which compiles into data structures visible to the host language" is a fluid one. And the most reasonable way of making little languages involves seeing the pattern you're making in your host and "harvesting" it. In the previous eras, there were severe performance penalties to trying to bootstrap in this way, and so generating a binary was essential to success. But nowadays, it's another form of glue overhead. If you define syntactic boundaries on your glue, it actually becomes easier to deal with over time.
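A minimal sketch of the "API call you pass a string into" end of that spectrum: a small infix parser (precedence climbing, a technique chosen for this example) that compiles a string into nested tuples the host language can inspect and evaluate. The function names `compile_expr` and `evaluate` are hypothetical.

```python
import operator
import re

TOKEN = re.compile(r"\s*(\d+|[-+*/()])")
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def tokenize(text):
    """Split the input into integer, operator, and parenthesis tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def compile_expr(text):
    """Compile an infix string into nested tuples visible to the host."""
    tokens = tokenize(text)
    pos = 0

    def atom():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        token = tokens[pos]
        pos += 1
        if token == "(":
            node = expr(1)
            if pos >= len(tokens) or tokens[pos] != ")":
                raise SyntaxError("expected ')'")
            pos += 1
            return node
        if token.isdigit():
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")

    def expr(min_prec):
        nonlocal pos
        left = atom()
        while pos < len(tokens) and PRECEDENCE.get(tokens[pos], 0) >= min_prec:
            op = tokens[pos]
            pos += 1
            # Binding one level tighter on the right gives left associativity.
            right = expr(PRECEDENCE[op] + 1)
            left = (op, left, right)
        return left

    tree = expr(1)
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return tree

def evaluate(node):
    """Walk the tuple tree produced by compile_expr."""
    if isinstance(node, int):
        return node
    op, left, right = node
    return OPS[op](evaluate(left), evaluate(right))

tree = compile_expr("1 + 2 * 3")
print(tree)            # ('+', 1, ('*', 2, 3))
print(evaluate(tree))  # 7
```

Because the result is ordinary host data, the same tree could be printed, analyzed, or rewritten before it is evaluated, which is the fluid boundary the comment describes.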

Documentation-wise, it's the same: if the language is sufficiently small, it feels like documentation for a library, not a language.

There are some modern success stories like LINQ or JSX.
To be clear, I'm not at all opposed to DSLs. I just think that creating a useful one is much more difficult and expensive than is typically acknowledged in these discussions. Creating a new DSL is probably not the first solution you should reach for before trying alternatives.
or CSS (and every one of its derivatives), GraphQL, SQL, regex. Maybe I'm misinterpreting, but each one is a language and something I'm currently using or used in the past. Little languages are everywhere.
JSX isn’t a separate language though. It’s recursively inline html/js.
Yeah, but it's not really HTML nor JS. It's a DSL which is a superset of JS for generating HTML that can use its own syntax in its expressions.
Yes, to add to your point: nobody has managed to use the "outputs" of the STEPS project to do something useful.

There was a cool "wordprocessor-like" (Franken?) demonstrated which was created with a small number of lines, it should be a huge success in the FOSS world no? Well no, nobody managed to make it work.

In addition, “little” languages tend to eventually become Turing-complete, because you keep needing that little extra bit of functionality.

And then you want to modularize your code because it becomes too big, and you want to create libraries for code reuse.

You end up wanting static typing for the usual reasons, which eventually leads to needing parametric types and recursively defined types, and the type system becoming Turing-complete as well.

Or you keep working around the limitations of the little language, writing code generators and wrapping it in general-purpose language APIs.

Would you (mis)use C to do text processing, or would you use shell tools?

I suppose all this leads me to the suspicion that little languages fill in for shortcomings in big languages. Big languages can absorb the things that work, thus negating the need for small languages in that sphere.

Although how far can that go? Can we keep making ever bigger languages? Or at some point does it crumble under its own weight?

Regarding CMake’s horrific documentation: I will literally be willing to pay money for someone who can show me how/if it’s possible to wire in a different language to CMake! I believe it’s possible, I’ve seen some functions deep in the crappy docs that make it look like it is, but I cannot for the life of me work it out. The language in question produces C object files!
CMake is pretty simple internally. Every "keyword" is a function call. Every function call is implemented as a class. The simplest way to hijack it would be to have a function that calls out to your external interpreter/tool with the state you want. I could see doing that in a couple of ways:

* Add a builtin command [1] that takes a string or filename and calls the interpreter with any additional data you want to pass.

* Add a flow control command [1] that passes the inline block to the interpreter of your choice. You'd probably have to override cmFunctionBlocker as well for this.

Note that this can't fix the deep design issues in CMake like the insane string representation.

And no, I'm definitely not in therapy from CMake-induced PTSD.

[1] https://github.com/Kitware/CMake/blob/master/Source/cmComman...

Champion! That’s exactly what I’ve been looking for haha

I’m trying to remove the need for external commands to compile Nim when used with ESP-IDF for embedded firmware development, which is dependent on CMake.

Going to take a crack at this today :)

Have you replaced writing C code for ESP with writing Nim and compiling it to C instead?
That’s exactly what we’re doing, yeah. --compileOnly gets pretty far, and I’d love to remove the need for having to run that compilation step separately before CMake builds the firmware from the generated C sources
Excellent, I've been hearing good things about Nim. I'm eager to try it out soon for an embedded project.
I think EDSLs are the happy medium here.
That is, if you do it properly and don't just amalgamate a few "known" languages into some unholy monstrosity. I'm looking at you, Ansible.

Generating data files via templating languages was never a good idea.

Using data languages as essentially code is a similarly bad idea.

Ansible does both at once.

It's much easier if your first step is to write a grammar for your DSL.