by ianbicking
1620 days ago
When you describe it that way it reminds me of SAX [1] – I always hated SAX, but eventually realized it was kind of a tokenizer that left it up to the developer to figure out how to turn that into a compiler, though in this case compiling XML input into some internal data structure or action.
[1] https://en.wikipedia.org/wiki/Simple_API_for_XML
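To make the "tokenizer plus do-it-yourself compiler" point concrete, here is a minimal sketch using Python's standard `xml.sax` module. The `TreeBuilder` class, the `parse_to_tree` helper, and the nested-dict layout are assumptions made up for illustration; SAX itself only delivers the events, and everything else is the developer's job.

```python
import xml.sax


class TreeBuilder(xml.sax.ContentHandler):
    """Hypothetical handler: folds the SAX event stream into nested dicts."""

    def __init__(self):
        super().__init__()
        self.root = None
        self.stack = []  # open elements, innermost last

    def startElement(self, name, attrs):
        node = {"tag": name, "attrs": dict(attrs.items()),
                "children": [], "text": ""}
        if self.stack:
            self.stack[-1]["children"].append(node)
        else:
            self.root = node
        self.stack.append(node)

    def characters(self, content):
        # SAX may deliver text in several chunks, so accumulate it.
        if self.stack:
            self.stack[-1]["text"] += content

    def endElement(self, name):
        self.stack.pop()


def parse_to_tree(xml_text):
    """Parse a string and return the root dict built by TreeBuilder."""
    handler = TreeBuilder()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.root


tree = parse_to_tree('<svg width="10"><rect x="1"/><text>hi</text></svg>')
print(tree["tag"], [child["tag"] for child in tree["children"]])
```

The parser only reports "element opened", "text seen", "element closed"; the stack bookkeeping that turns those into a structure is entirely in the handler.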
At BitFlash, one of the things we had to build was a SAX parser for the SVG DOM. I used the DSL of the W3C spec to compile the SAX parser.
One of the stranger things I did: in some contexts (think old-school BlackBerry) we had a server pre-parse the SVG so we "knew" it was clean (20 years later I'm still sceptical that this is ever a good strategy, but take it as a given). Because we knew the SVG was clean, there was a faster way to parse the XML than reading the tokens.
I used my magic Perl script transformer to compute the lowest-entropy decision tree, the one that identifies a token with the fewest comparisons, which was surprisingly way more efficient than a trie.
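A rough sketch of that idea in Python, not the original Perl script: given a closed set of tokens, greedily pick the character position whose test leaves the least remaining entropy, and recurse. The function names, the tuple-based tree layout, the uniform weighting of tokens, and the sample SVG element names are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def char_at(token, pos):
    """Character at pos, or "" when the token is shorter than that."""
    return token[pos] if pos < len(token) else ""


def group_by_char(tokens, pos):
    groups = defaultdict(list)
    for t in tokens:
        groups[char_at(t, pos)].append(t)
    return groups


def remaining_entropy(tokens, pos):
    """Expected bits still unresolved after testing position pos,
    assuming every token is equally likely."""
    total = len(tokens)
    return sum(len(g) / total * math.log2(len(g))
               for g in group_by_char(tokens, pos).values())


def build_tree(tokens):
    """Leaf: the token string. Inner node: (position, {char: subtree})."""
    tokens = sorted(set(tokens))
    if len(tokens) == 1:
        return tokens[0]
    longest = max(len(t) for t in tokens)
    pos = min(range(longest), key=lambda p: remaining_entropy(tokens, p))
    return (pos, {ch: build_tree(group)
                  for ch, group in group_by_char(tokens, pos).items()})


def identify(tree, text):
    """Walk the tree; returns (token, number of character tests)."""
    tests = 0
    while not isinstance(tree, str):
        pos, branches = tree
        tree = branches[char_at(text, pos)]
        tests += 1
    return tree, tests


NAMES = ["circle", "clipPath", "defs", "ellipse", "g", "line",
         "path", "polygon", "polyline", "rect", "svg", "text"]
TREE = build_tree(NAMES)
for name in NAMES:
    print(name, identify(TREE, name))
```

Unlike a trie, which has to test characters left to right, the tree is free to jump to whichever position discriminates best. It only works on input already known to be clean: a token outside the set either raises `KeyError` or is silently mistaken for a known one, because the remaining characters are never checked.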