Surprising to see this article. I am a second-year CS student, and in one of the assignments for our OS course we are actually building a shell in C. A very simplistic one. Great to read.
Maybe not. I attended Florida Atlantic University and that was a project I did. The shell wasn't much: it just ran commands, supported redirection and pipes, and (I think) handled environment variables (for example, "ls $HOME").
I have personally tried to build one in C, but the parsing was the real pain. I managed to write a tokenizer, barely figured out how to build an AST, and never figured out what to do with it. All the parsing tutorials are about mathematical expressions, and I found it hard to adapt them to shell grammar.
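For what it's worth, here is a rough sketch of what "doing something with the AST" can look like. It is in Python rather than C to keep it short, and it assumes the tokenizer already produced a flat list of strings. The grammar covers only `;`, `&&`, `||` and `|`, and the names (`Parser`, `execute`, the `run` callback) are made up for illustration, not taken from any real shell.

```python
OPS = {"|", "&&", "||", ";"}

class Parser:
    """Recursive descent over a token list: list -> and_or -> pipeline -> command."""

    def __init__(self, tokens):
        self.toks = tokens
        self.pos = 0

    def peek(self):
        return self.toks[self.pos] if self.pos < len(self.toks) else None

    def take(self):
        tok = self.toks[self.pos]
        self.pos += 1
        return tok

    def parse_list(self):
        items = [self.parse_and_or()]
        while self.peek() == ";":
            self.take()
            if self.peek() is None:  # allow a trailing ';'
                break
            items.append(self.parse_and_or())
        return ("seq", items)

    def parse_and_or(self):
        node = self.parse_pipeline()
        while self.peek() in ("&&", "||"):
            op = self.take()
            kind = "and" if op == "&&" else "or"
            node = (kind, node, self.parse_pipeline())  # left-associative
        return node

    def parse_pipeline(self):
        cmds = [self.parse_command()]
        while self.peek() == "|":
            self.take()
            cmds.append(self.parse_command())
        return ("pipe", cmds)

    def parse_command(self):
        words = []
        while self.peek() is not None and self.peek() not in OPS:
            words.append(self.take())
        if not words:
            raise SyntaxError("expected a command")
        return ("cmd", words)


def execute(node, run):
    """Walk the AST. `run` takes a list of argv lists (one pipeline) and
    returns an exit status; it is a hypothetical callback supplied by the caller."""
    kind = node[0]
    if kind == "seq":
        status = 0
        for item in node[1]:
            status = execute(item, run)
        return status
    if kind == "and":
        status = execute(node[1], run)
        return execute(node[2], run) if status == 0 else status
    if kind == "or":
        status = execute(node[1], run)
        return status if status == 0 else execute(node[2], run)
    if kind == "pipe":
        return run([cmd[1] for cmd in node[1]])
    raise ValueError("unknown node kind: %r" % (kind,))
```

The point is that the "evaluator" is just a function that switches on the node type and recurses, the same shape as a calculator for math expressions, except the leaves run processes instead of returning numbers.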
Yes, a huge part of a shell is parsing, and C is a bad language for that.
If you want POSIX shell you'll have at least 5K lines of parsing code; if you want bash it's at least 10K lines. It's closer to 20K lines of C in bash itself.
There's really no way around that, and IMO the best answer is to use a different language -- which is ALSO hard, because many language runtimes don't support fork() or signals in the way that a shell needs.
(e.g. CPython is actually closer than say Go because it supports fork() and exec(), but even it has issues with signals, EINTR, etc.)
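To make the fork()/exec() point concrete, here is a minimal sketch of a pipeline runner using CPython's `os` module on a Unix-like system. The function name `run_pipeline` is made up, and this deliberately ignores signals, job control, and redirection, which is where the real trouble starts.

```python
import os
import sys

def run_pipeline(commands):
    """Run a list of argv lists as cmd1 | cmd2 | ...
    Returns the exit status of the last command (128 + signal if it was killed)."""
    pids = []
    prev_read = None  # read end of the pipe that feeds the next command
    for i, argv in enumerate(commands):
        is_last = i == len(commands) - 1
        if not is_last:
            read_fd, write_fd = os.pipe()
        sys.stdout.flush()  # avoid duplicating buffered output in the child
        pid = os.fork()
        if pid == 0:  # child process
            try:
                if prev_read is not None:
                    os.dup2(prev_read, 0)  # stdin comes from the previous pipe
                    os.close(prev_read)
                if not is_last:
                    os.close(read_fd)
                    os.dup2(write_fd, 1)  # stdout goes into the next pipe
                    os.close(write_fd)
                os.execvp(argv[0], argv)  # only returns by raising on failure
            except OSError:
                pass
            os._exit(127)  # conventional "command not found" status
        # parent process: close the pipe ends it no longer needs
        if prev_read is not None:
            os.close(prev_read)
        if not is_last:
            os.close(write_fd)
            prev_read = read_fd
        pids.append(pid)

    status = 0
    for pid in pids:  # the last pid waited on is the last command
        _, raw = os.waitpid(pid, 0)
        if os.WIFEXITED(raw):
            status = os.WEXITSTATUS(raw)
        else:
            status = 128 + os.WTERMSIG(raw)
    return status
```

For example, `run_pipeline([["ls"], ["wc", "-l"]])` behaves roughly like `ls | wc -l`, assuming both programs are on the PATH.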
Parser generators aren't widely used for implementing shells (or JavaScript engines, or C/C++ compilers, for that matter). IMO they're nice for designing languages, but not necessarily implementing them.
bash is actually one of the only shells that uses yacc, and the maintainer regards it as a mistake. It uses yacc for maybe 1/4 of the language, and the rest is all hand-written code intertwined with the generated code. It's pretty messy.
Parsing shell input is somewhat different from parsing other languages because keywords are contextual. For example, `if echo` and `echo if` are both legal, but `if` is only a keyword in the first example, where it appears in command position. This affects the design of the lexer.
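A toy illustration of the contextual keyword issue, in Python: the lexer has to track whether it is in "command position" before it can decide that `if` is a keyword. This is a simplified sketch, not the full POSIX rule. It assumes the input is already split into words, handles only a handful of reserved words (no `for`, `case`, or `in`), and the name `classify` is made up.

```python
KEYWORDS = {"if", "then", "elif", "else", "fi", "while", "until", "do", "done"}
# Operators after which a new command begins.
COMMAND_START_OPS = {";", "|", "&&", "||", "&"}

def classify(words):
    """Tag each word as KEYWORD, OP, or WORD depending on its position."""
    out = []
    at_command_start = True  # the first word of the input is in command position
    for w in words:
        if w in COMMAND_START_OPS:
            out.append(("OP", w))
            at_command_start = True
        elif at_command_start and w in KEYWORDS:
            out.append(("KEYWORD", w))
            # Simplification: another command (or keyword) may follow any of these.
            at_command_start = True
        else:
            out.append(("WORD", w))
            at_command_start = False  # the rest are arguments until an operator
    return out
```

With this, `classify("if echo".split())` tags `if` as a keyword, while `classify("echo if".split())` tags both words as plain words.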