by ww520
760 days ago
I've seen a lexer/parser scheme that packs the lexer token type and the token's source location into a single u64, something like:

    struct Token {
        token_type: u8,
        type_info: u8,
        token_start: u32, // offset into the source file
        token_len: u16,
    }

The four fields add up to exactly 64 bits. It's blazing fast; the lexer/parser can process millions of lines per second. The token text can be recovered from the source using the offset and length, so both the textual information and the location information are available.
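For anyone curious what that could look like in practice, here is a minimal sketch in Rust. The bit layout (type in the low byte, length in the top 16 bits) and the helper names `new`, `text`, and `line_col` are my own assumptions for illustration, not taken from the scheme I saw.

```rust
/// A token packed into one u64.
/// Assumed layout (hypothetical): bits 0..8 token_type, 8..16 type_info,
/// 16..48 token_start, 48..64 token_len.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Token(u64);

impl Token {
    /// Pack the four fields into a single integer.
    pub fn new(token_type: u8, type_info: u8, token_start: u32, token_len: u16) -> Self {
        Token(
            (token_type as u64)
                | ((type_info as u64) << 8)
                | ((token_start as u64) << 16)
                | ((token_len as u64) << 48),
        )
    }

    // Each accessor shifts its field down; the `as` cast truncates the rest.
    pub fn token_type(self) -> u8 {
        self.0 as u8
    }
    pub fn type_info(self) -> u8 {
        (self.0 >> 8) as u8
    }
    pub fn token_start(self) -> u32 {
        (self.0 >> 16) as u32
    }
    pub fn token_len(self) -> u16 {
        (self.0 >> 48) as u16
    }

    /// Recover the token text by slicing the original source.
    /// Panics if the range is out of bounds or not on a UTF-8 boundary.
    pub fn text<'a>(self, source: &'a str) -> &'a str {
        let start = self.token_start() as usize;
        &source[start..start + self.token_len() as usize]
    }

    /// Compute a 1-based (line, byte column) lazily, only when an error
    /// needs reporting, by scanning the bytes before the token.
    pub fn line_col(self, source: &str) -> (usize, usize) {
        let before = &source.as_bytes()[..self.token_start() as usize];
        let line = before.iter().filter(|&&b| b == b'\n').count() + 1;
        let col = before.iter().rev().take_while(|&&b| b != b'\n').count() + 1;
        (line, col)
    }
}
```

With this layout, `Token::new(3, 0, 12, 3).text("let x = 42;\nfoo(bar)")` gives back `"foo"`. The trade-off is in the field widths: a u32 offset caps the source at 4 GiB and a u16 length caps a single token at 65,535 bytes.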
It's been developed for embedded systems (it was written originally for a NATS implementation in the Zephyr RTOS), so it's a bit limited and there's no easy way to know where some parsing/type validation error happened, but the information is there if one wants to obtain it: https://github.com/lpereira/lwan/blob/master/src/samples/tec...