Hacker News new | ask | show | jobs
by kmstout 485 days ago
> (Keeping in mind that traditional regexes can't balance brackets; presumably they can't properly track indentation levels either.)

You're right: Regular expressions are equivalent to finite state machines[1], which lack the infinite memory needed to handle arbitrarily nested structures [2]. If there is a depth limit, however, it is possible (but painful) to craft a regex to describe the situation. For example, suppose you have a language where angle brackets serve as grouping symbols, like parentheses usually do elsewhere [3]. Ignoring other characters, you could verify balanced brackets up to one nesting level with

  /^(<>)*$/
and two levels with

  /^(<(<[^<>]*>|[^<>])*>)*$/
Don't do this when you have better options.

---

[1] https://reindeereffect.github.io/2018/06/24/index.html

[2] As do any machines I can afford, but my money can buy a pretty good illusion.

[3] < and > are not among the typical regex metacharacters, so they make for an easier discussion.