by NathanKP 362 days ago
I think there is a lot of overlap between DSLs and frameworks, and most frameworks contain some form of DSL in them.

What matters most of all is whether the DSL is written in semantically meaningful tokens. Two extremes as examples:

Regex is a DSL whose tokens have no inherent semantic meaning. LLMs can only understand Regex because it has been around for a long time and there are millions of examples for them to work from. And even then, LLMs still struggle with reading and writing Regex.

Tailwind is an example of a DSL that is very semantically rich. When an LLM sees `class="text-3xl font-bold underline"`, it pretty much knows what that means out of the box, just like a human does.

Basically, a fresh new DSL can succeed much faster if it is closer to Tailwind than it is to Regex. The other side of DSLs is that they tend to be concise, and that can actually be a great thing for LLMs: more concise means fewer tokens, which means faster coding agents and faster responses from prompts. But too much conciseness (in the manner of Regex) leads to semantically confusing syntax, and then LLMs struggle.
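To make the conciseness point concrete, here is a rough sketch that compares the Tailwind classes above with the plain CSS they stand for. Two assumptions: the CSS shown is what Tailwind's default theme generates for those utilities, and the `crude_token_count` helper (hypothetical, just a regex split) is only a stand-in for a real LLM tokenizer, which would give different absolute numbers.

```python
import re

def crude_token_count(text: str) -> int:
    """Count word-like runs and punctuation marks; a rough proxy, not a real tokenizer."""
    return len(re.findall(r"\w+|[^\w\s]", text))

tailwind = "text-3xl font-bold underline"

# Assumed plain-CSS equivalent under Tailwind's default theme.
plain_css = (
    "font-size: 1.875rem; line-height: 2.25rem; "
    "font-weight: 700; text-decoration-line: underline;"
)

print(crude_token_count(tailwind), crude_token_count(plain_css))
```

By this crude measure the utility classes come out several times shorter, while each token still reads as a plain English word.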

2 comments

Knowing just what's going on in the existing text isn't the whole problem in navigating a DSL. You have to be able to predict new things based on the patterns in existing text.

Let's say you want to generate differently sized text here. An LLM will have ingested lots of text talking about clothing sizes, and Tailwind text sizes vaguely follow that pattern. Maybe it generates `text-medium` as a guess instead of the irregular `text-base`, or extends the numeric pattern down into `text-2xs`.
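One way to catch that kind of plausible-but-wrong guess is to check generated class names against the scale that actually exists. A minimal sketch, assuming Tailwind's default font-size scale (`text-xs` through `text-9xl`) with no custom theme; `invalid_text_sizes` is a hypothetical helper, not part of any Tailwind tooling:

```python
import re

# Tailwind's default font-size suffixes (assumes no custom theme config).
DEFAULT_TEXT_SIZES = {"xs", "sm", "base", "lg", "xl"} | {f"{n}xl" for n in range(2, 10)}

# Suffixes that look like a size: clothing-style words or the numeric Nxl/Nxs pattern.
SIZE_LIKE = re.compile(r"^(\d*x[sl]|sm|small|md|medium|base|lg|large)$")

def invalid_text_sizes(class_attr: str) -> list[str]:
    """Return text-* classes that look like sizes but are not in the default scale."""
    bad = []
    for cls in class_attr.split():
        if not cls.startswith("text-"):
            continue
        suffix = cls[len("text-"):]
        # Skip non-size utilities such as text-center or text-red-500.
        if SIZE_LIKE.match(suffix) and suffix not in DEFAULT_TEXT_SIZES:
            bad.append(cls)
    return bad

print(invalid_text_sizes("text-medium font-bold text-2xs text-base text-3xl"))
```

Feeding the rejected names back to the model along with the valid scale is one way to supply the extra context it needs.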

> there is a lot of overlap between DSLs and frameworks

Not just frameworks, but libraries also. Interacting with some of the most expressive libraries is often akin to working with a DSL.

In fact, the paradigms of some libraries required such expressiveness that they spawned their own in-language DSLs, like JSX for React, or LINQ expressions in C#. These are arguably the most successful DSLs out there.

Embedded DSLs have their own challenge, since the LLM can easily move out of the DSL into the host language in ways that aren’t valid for the eDSL. In my experience, you really need to narrow the focus with more context to get anything useful out of the LLM.