Hacker News new | ask | show | jobs
by maxbond 324 days ago
Most people aren't writing something as complex as SQLite, but most people aren't writing parsers either. Those writing parsers are disproportionately writing things like programming languages and language servers that are quite complex.

SQLite isn't some kind of universal template, I'm not saying people should copy it or that recursive descent is a bag choice. But empirically parser generators are used in real production systems. SQLite is unusual in that they also wrote the parser generator, but otherwise is in good company. Postgres uses Bison, for example.

Additionally, I think that Lemon was started as a personal learning project in grad school (as academic a project as it gets) and evolved into a component of what is probably the most widely deployed software system of all time shows this distinction between what is academic and what is practical isn't all that meaningful to begin with. What's academic becomes practical when the circumstances are right. Better to evaluate a technique in the context of your problem than to prematurely bin things into artificial categories.

1 comments

> Those writing parsers are disproportionately writing things like programming languages and language servers that are quite complex.

Sure, but adding the complexity of a parser generator doesn't help with that complexity in most cases.

[General purpose] programming languages are a quintessential example. Yes, a compiler or an interpreter is a very complex program. But unless your programming language needs to be parsed in multiple languages, you definitely do not need to generate the parser in many languages like SQLite does. That just adds complexity for no reason.

You can't just say "it's complex, therefore it needs a parser generator" if adding the parser generator doesn't address the complexity in any way.

I'm not saying everyone or even anyone in particular needs a parser generator. I'm saying real, widely deployed projects - as far as you can get from academic - empirically find it useful.

Creating abstractions does decrease complexity if one (or more) of the following is true:

- The abstraction generates savings in excess of it's own complexity

- The abstraction is shared by enough projects to amortize the cost of writing/maintaining it to a tolerable level

- There are additional benefits like validating your grammars are unambiguous or generating flow charts of your syntax in your documentation, amortizing the cost across different features of the same project

It's up to you as the implementer to weigh the benefits and costs. If you choose to use recursive descent, more power to you. (For what it's worth, I personally use parser combinators to split the difference between writing grammars and hand-rolling parsers. But I've used parser generators before and found them helpful.)

I agree, I just don't understand why you seem to think this is a correction to anything I said.

If your goal is simply to reduce bugs--not something more complex like generating parsers in a bunch of languages--then hand rolling a parser generator and then using it to generate your parser [singular] is not a path to achieving your goals. That's what I said, and that's actually just true, which you probably know.

This is not an invitation to bring up irrelevant, exceptional cases, it's the rule of thumb you should operate on. Put another way, don't add layers when there isn't a reason to do so. If there is a reason to do so, have at it. Obviously.

In a meta sense, it's pretty socially inept to jump in with corrections like this. In a complex field like programming, of course there are exceptions, and it's disrespectful to the group of professionals in the room to assume that they don't know about the exceptions. I'm guilty of this myself: it's because I was brought up being praised for knowing things, so I want to demonstrate that I know things. But as an adult, I had to learn that I'm not the only knowledgeable person in the room, and it's rude to assume that I am.