by c0nstantine
491 days ago
Yeah. Transducers are a very old topic. For some reason they were not connected to a specific language like regex.
> wrt syntax, are you sure you want ':' to bind stronger than concatenation 'ab' ?
That's something I am still not sure about. I took a hundred examples and it looked more natural this way (: lower than .). But I can change it by changing one digit in the code, literally. That's why I'm posting here. I need some real feedback.
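To make the precedence question concrete, here is a minimal sketch of a precedence-climbing parser where the binding of ':' relative to concatenation is a single number. It is not the project's actual parser: `parse`, `colon_bp`, `concat_bp` and the tuple-shaped tree are names made up for this illustration.

```python
# Hypothetical sketch: not taken from the project's code.
# Concatenation is implicit (juxtaposition); ':' is an explicit infix operator.
def parse(src, colon_bp=30, concat_bp=20):
    pos = 0

    def peek():
        return src[pos] if pos < len(src) else None

    def atom():
        nonlocal pos
        ch = peek()
        if ch is None or ch in ":)":
            raise SyntaxError(f"unexpected {ch!r} at position {pos}")
        pos += 1
        if ch == "(":
            node = expr(0)
            if peek() != ")":
                raise SyntaxError("missing ')'")
            pos += 1
            return node
        return ch  # a literal character

    def expr(min_bp):
        nonlocal pos
        left = atom()
        while True:
            ch = peek()
            if ch is None or ch == ")":
                break
            if ch == ":":
                if colon_bp < min_bp:
                    break
                pos += 1  # consume ':'
                left = (":", left, expr(colon_bp + 1))
            else:
                if concat_bp < min_bp:
                    break
                left = ("cat", left, expr(concat_bp + 1))
        return left

    tree = expr(0)
    if pos != len(src):
        raise SyntaxError(f"unexpected {src[pos]!r} at position {pos}")
    return tree


# ':' binds tighter than concatenation: ab:c reads as a(b:c)
print(parse("ab:c"))
# ':' binds looser than concatenation: ab:c reads as (ab):c
print(parse("ab:c", colon_bp=10))
```

With the default numbers `ab:c` groups as `a(b:c)`; lowering `colon_bp` below `concat_bp` flips it to `(ab):c`, which is the kind of one-digit change described above.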
also,
* colon as member of a character class (and dash and how they interact)
* posix character classes (and the colon in the syntax)
* word boundaries are really useful in practice
* think of ünicøde early and look at what russ cox did there
boundaries: what do you decide to exclude? for example, back references and grouping are trouble for DFAs (again, see russ cox and re2)
composition is fantastic, and a shortcut to exponential land because grammars (as in Chomsky hierarchy) can easily be expressed with transducers, yay.
boundaries will also clarify use cases and allow you to state: "if you want to do xyz please use re2 and a bit of code instead"
and one "technicality" that hit me once with re2 was a static buffer for resumable iteration. I'd have loved to reallocate the buffer, drop its beginning and extend it at the end, but alas, the state of re2 has hard pointers into the buffer, not relative ones. I think this was when re2 hit the end of the buffer mid-expression during an incremental parse. so you can't reallocate; instead you have to drop the hard-earned state and resume.
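to illustrate what "relative ones" could look like, here is a rough sketch (not re2 code and not its API; `SlidingBuffer` and its methods are invented for this example). the matcher state keeps stream offsets instead of pointers, so the buffer is free to drop its beginning and grow at the end:

```python
# Hypothetical sketch of a buffer addressed by stream offsets, not pointers.
class SlidingBuffer:
    def __init__(self):
        self.data = bytearray()
        self.base = 0  # stream offset of data[0]

    def append(self, chunk):
        """Extend the buffer at the end; the storage may be reallocated."""
        self.data += chunk

    def discard_before(self, offset):
        """Drop everything before the given stream offset."""
        drop = offset - self.base
        if drop < 0 or drop > len(self.data):
            raise ValueError("offset outside the buffered range")
        del self.data[:drop]
        self.base = offset

    def slice(self, start, end):
        """Return the bytes for the stream range [start, end)."""
        if start < self.base or end > self.base + len(self.data):
            raise IndexError("range not in buffer")
        return bytes(self.data[start - self.base:end - self.base])


buf = SlidingBuffer()
buf.append(b"hello wor")   # input ends mid-token
match_start = 6            # state holds an offset, not a pointer
buf.discard_before(match_start)
buf.append(b"ld!")         # more input arrives
print(buf.slice(match_start, match_start + 5))
```

the saved `match_start` stays valid across both the discard and the append, which is exactly what hard pointers into the old buffer cannot give you.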
anyhow, it's been quite a while so I'm no longer in the thicket of this :-D
what's your driver? curiosity? or some itch?
but I really enjoy seeing your project :-)