Thanks for studying this; your emphasis is noted. Aside from this particular bullet from the "notable limitations," are there any other shortcomings that you feel are deal breakers for you?
I don't like the YACC style of code fragments with placeholders, but the system seems well designed and it's likely to be good enough in practice.
But seriously, not being able to parse text is more than enough of a limitation for a text parsing tool. The token specification keeps close to standard regular expressions, and matching Unicode text with regular expressions (https://unicode.org/reports/tr18/) is a rather well-researched problem with good implementations.
Thanks. It's not that UTF-8 is not on the list; it's always been on the list, it's just not there yet. Hence I felt the need to stipulate the lack of it in the manual, because of its importance.
If you're so inclined, examine rex.c, and you'll see (in rex_nfa_make_ranged_trans(), for example) that the engine internally works with ranges of uint32 for this very Unicode reason.
The front-end regex parser and driver code, however, are not there yet, so prior to code emission, these beautiful ranges of uint32 codepoints are back-translated into rote uint8 tables. Such is the fate of wanting to ship. It'll come.
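A minimal sketch of what such a back-translation could look like, as an illustration only and not taken from rex.c: the struct, the function name, and the choice to clip ranges at 0xFF are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical type (not from rex.c): an inclusive range of uint32
 * codepoints that leads to a single target state. */
struct cp_range_trans {
    uint32_t first;
    uint32_t last;
    uint8_t  target;
};

/* Lower codepoint ranges into a 256-entry table indexed by uint8.
 * Entries not covered by any range are set to no_target.
 * Anything above 0xFF cannot be represented and is clipped away;
 * the return value counts the ranges that lost codepoints this way. */
static size_t lower_ranges_to_byte_table(const struct cp_range_trans *ranges,
                                         size_t n,
                                         uint8_t no_target,
                                         uint8_t table[256])
{
    size_t clipped = 0;

    memset(table, no_target, 256);

    for (size_t i = 0; i < n; i++) {
        uint32_t first = ranges[i].first;
        uint32_t last = ranges[i].last;

        if (first > last)
            continue;            /* empty range, nothing to emit */
        if (last > 0xFF)
            clipped++;           /* part or all of the range is lost */
        if (first > 0xFF)
            continue;            /* entirely outside the byte table */
        if (last > 0xFF)
            last = 0xFF;

        for (uint32_t c = first; c <= last; c++)
            table[c] = ranges[i].target;
    }
    return clipped;
}
```

In this sketch a range such as U+4E00..U+9FFF simply disappears from the table, which is exactly the kind of loss a UTF-8 aware front end would have to avoid.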