by BugsBunnySan 3489 days ago
I tried to implement a shell in C once, just to see how to do it, until I realized I was basically just writing a shell language interpreter, and thus implementing a lexer and parser for that language manually (which, it seems, is exactly what happens in the article).

Which made the whole project boring to me, since it would've just come down to 3 well-defined, already solved problems: 1. Write a shell language grammar file. 2. Use something like flex/bison (nowadays maybe ANTLR, or whatever) to generate the shell language parser. 3. Implement the behaviour of the shell as reactions to parser events.

And that wasn't interesting because: 1. I wanted to be POSIX compatible, and the POSIX shell language is well defined, so writing the grammar for it would've been boring. 2. Boring by design: the whole point of using parser generators is to make compiler/interpreter writing an easy, boring task. 3. Arguably, this would've been interesting, to see how to do it. But if you already have a deep understanding of how a *nix shell does what it does, it's not that interesting anymore...

1 comments

A shell is indeed an interpreter for a programming language, and you do need a lexer and a parser (actually 4 parsers, and my lexer requires 13 modes).

But that's not what's happening in the article -- he is showing a simple REPL and fork() exec(), which is about as much as you can expect to do in an article that long. lsh_split_line() in no way resembles a real shell lexer, and there is no parser, since there are no programming language features like function calls, loops, and conditionals. Not to mention pipelines, subshells, and redirects.
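
For reference, the fork()/exec() core that such a simple REPL is built around can be sketched roughly as follows. This is a minimal illustration, not the article's code: `run_command` is a hypothetical helper name, and returning 127 when the program cannot be executed follows the usual shell convention rather than anything stated in the thread.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from the article): run argv[0] with its
 * arguments in a child process. Returns the child's exit status,
 * or -1 if the child could not be created or did not exit normally. */
int run_command(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: replace this process image with the requested program. */
        execvp(argv[0], argv);
        perror("execvp");   /* only reached if exec failed */
        _exit(127);         /* conventional "command not found" status */
    }

    /* Parent: wait for the child and decode its status. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A REPL would call this once per input line, e.g. with `char *argv[] = {"ls", "-l", NULL};`. Everything that makes a shell a language (pipelines, redirects, subshells, control flow) would have to sit between reading the line and this call.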

I think you're overestimating the power of flex, bison, and ANTLR. I actually ported the POSIX shell grammar to ANTLR -- it's not usable as the basis for a shell parser. The POSIX grammar also only covers 1 of 4 sublanguages in shell, and it's a significantly smaller language than bash.
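
To give a rough idea of what "lexer modes" means, here is a toy word splitter with just three modes, in which the same character is treated differently depending on the quoting context. It is an illustrative sketch only and is not taken from any real shell: `split_words`, `MAX_WORD`, and the mode names are made up for this example, and the backslash rules inside double quotes are deliberately simplified compared to POSIX.

```c
#include <stddef.h>

#define MAX_WORD 64   /* arbitrary per-word limit for this sketch */

enum lex_mode { MODE_UNQUOTED, MODE_SINGLE, MODE_DOUBLE };

/* Hypothetical helper: split `line` into words, honoring quotes.
 * Returns the number of words, or -1 on an unterminated quote or overflow. */
int split_words(const char *line, char words[][MAX_WORD], int max_words)
{
    enum lex_mode mode = MODE_UNQUOTED;
    int count = 0;     /* completed words */
    size_t len = 0;    /* length of the word being built */
    int in_word = 0;   /* set once a word has started, even if empty ("") */

    for (const char *p = line; *p != '\0'; p++) {
        char c = *p;
        int literal = 0;   /* 1 if c should be appended to the current word */

        switch (mode) {
        case MODE_UNQUOTED:
            if (c == ' ' || c == '\t' || c == '\n') {
                if (in_word) {               /* whitespace ends a word */
                    words[count++][len] = '\0';
                    len = 0;
                    in_word = 0;
                }
            } else if (c == '\'') {
                mode = MODE_SINGLE;
                in_word = 1;
            } else if (c == '"') {
                mode = MODE_DOUBLE;
                in_word = 1;
            } else if (c == '\\' && p[1] != '\0') {
                c = *++p;                    /* backslash escapes next char */
                literal = 1;
            } else {
                literal = 1;
            }
            break;
        case MODE_SINGLE:                    /* everything is literal until ' */
            if (c == '\'')
                mode = MODE_UNQUOTED;
            else
                literal = 1;
            break;
        case MODE_DOUBLE:                    /* simplified: only \" and \\ */
            if (c == '"') {
                mode = MODE_UNQUOTED;
            } else if (c == '\\' && (p[1] == '"' || p[1] == '\\')) {
                c = *++p;
                literal = 1;
            } else {
                literal = 1;
            }
            break;
        }

        if (in_word || literal) {
            if (count == max_words)
                return -1;                   /* too many words */
            in_word = 1;
        }
        if (literal) {
            if (len + 1 >= MAX_WORD)
                return -1;                   /* word too long */
            words[count][len++] = c;
        }
    }

    if (mode != MODE_UNQUOTED)
        return -1;                           /* unterminated quote */
    if (in_word)
        words[count++][len] = '\0';
    return count;
}
```

Even this toy needs per-mode rules for the backslash alone. A real shell lexer also has to deal with variable and command substitution, arithmetic, here-documents and more, which is where the mode count grows.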

Bash uses bison/yacc, and maintainer Chet Ramey talks about how it was a mistake here:

http://www.aosabook.org/en/bash.html

My blog is basically about parsing bash, and I discovered a lot of interesting things:

http://www.oilshell.org/blog/2016/10/20.html

http://www.oilshell.org/blog/2016/11/01.html

http://www.oilshell.org/blog/2016/10/17.html

Well, very interesting indeed! Cool that you actually went through with writing a complete shell :)

I guess I might've been too optimistic about the power of parser generators. Still, if someone were to start implementing a shell, I'd advise them to start with the grammar/lexer/parser part, either implementing it themselves or trying some of the generators. I think it's the right way to go about it (it just wasn't what I thought I'd be doing when implementing a shell, at the time).

I did do a few things with parser generators, though. Granted, nothing (yet) as complex as bash or even just POSIX, but still, they each had their own little tricky bits. E.g.:

https://github.com/BugsBunnySan/edl (ANTLR4 / Python)

https://github.com/BugsBunnySan/Phat-Agnus (YAPP / Perl)