|
|
|
|
|
by cscottnet
2320 days ago
|
|
It appears that you were probably describing the lexer pass in your description of docuwiki. Indeed tokenization is a very hard problem for wikitext. We use a pegjs grammar for it, but it contains less of lookahead/special conditions/novel extensions, etc. It's hard. Wikitext is messy precisely because it was intentionally designed to be easy and forgiving to write. Seems like we've learned many of the same lessons building our parsers. Markup parsers do seem to be a unique thing, not really like parsing either programming languages or natural languages. If we every meet I'm sure we could happily share a beverage of your choice trading stories. |
|