by jstimpfle
1063 days ago
UTF-8 was designed so that you don't have to worry about it. Supporting UTF-8 in a parser is trivial: basically, just parse as if it were ASCII, but don't barf on bytes >= 128. As long as all your delimiter chars are ASCII, it just works, because every byte of a multi-byte UTF-8 sequence is >= 128 and so can never be mistaken for an ASCII delimiter.

Errors in C are usually because of missing abstractions or the wrong approach. C gives you data layout, flow control, and functions; you can go a long, long way with just that.

> unbounded lookaheads

If you want to require that, you get what you deserve. But implementing it is just a matter of putting a queue of tokens in front of your parser that supports look(n) separately from consume().
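To make both points concrete, here is a rough sketch of what I mean, not a definitive implementation. The names (`TokenStream`, `ts_look`, `ts_consume`, `tok_eq`), the delimiter set, and the fixed `QUEUE_CAP` are all made up for illustration. The lexer splits on ASCII delimiters and passes bytes >= 128 through untouched, and a small ring buffer in front of it gives look(n) separately from consume().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A token is a slice of the input buffer; nothing is copied or decoded. */
typedef struct {
    const char *start;
    size_t len;               /* len == 0 means end of input */
} Token;

/* Hypothetical bound on lookahead. Truly unbounded lookahead would need
 * a growable buffer here instead of a fixed array. */
#define QUEUE_CAP 16

typedef struct {
    const char *pos;          /* next unread byte of the NUL-terminated input */
    Token queue[QUEUE_CAP];   /* ring buffer: tokens lexed but not yet consumed */
    size_t head;              /* index of the oldest queued token */
    size_t count;             /* number of queued tokens */
} TokenStream;

/* Only ASCII delimiters (an assumed set). Every byte of a multi-byte
 * UTF-8 sequence is >= 0x80, so it can never match one of these. */
static int is_delim(unsigned char c)
{
    return c == ' ' || c == ',' || c == '\t' || c == '\n';
}

/* Lex one token straight from the input, treating bytes >= 128 like any
 * other non-delimiter byte. */
static Token lex_one(TokenStream *ts)
{
    Token t;
    while (*ts->pos && is_delim((unsigned char)*ts->pos))
        ts->pos++;
    t.start = ts->pos;
    while (*ts->pos && !is_delim((unsigned char)*ts->pos))
        ts->pos++;
    t.len = (size_t)(ts->pos - t.start);
    return t;
}

void ts_init(TokenStream *ts, const char *input)
{
    ts->pos = input;
    ts->head = 0;
    ts->count = 0;
}

/* look(n): return the n-th upcoming token (0 = next) without consuming it.
 * Lexes more tokens into the queue only as needed. */
Token ts_look(TokenStream *ts, size_t n)
{
    assert(n < QUEUE_CAP);
    while (ts->count <= n) {
        ts->queue[(ts->head + ts->count) % QUEUE_CAP] = lex_one(ts);
        ts->count++;
    }
    return ts->queue[(ts->head + n) % QUEUE_CAP];
}

/* consume(): remove and return the next token. */
Token ts_consume(TokenStream *ts)
{
    Token t = ts_look(ts, 0);
    ts->head = (ts->head + 1) % QUEUE_CAP;
    ts->count--;
    return t;
}

/* Byte-wise comparison of a token against a NUL-terminated string. */
int tok_eq(Token t, const char *s)
{
    return t.len == strlen(s) && memcmp(t.start, s, t.len) == 0;
}
```

The parser only ever calls `ts_look` and `ts_consume`; it never touches the lexer directly, so how far ahead you peek is purely a question of how big the queue is. Note that nothing here validates the UTF-8. It just doesn't choke on it, which is all a parser with ASCII delimiters needs.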