I'm not sure it's a good idea to mix such script-specific phonetic conventions for diacritic marks with the behavior of the program over time. Even judging by shape alone, the mark does not match the idea of first down a little, then up a lot.
To be sure, it's a joke. Mostly I was poking fun at these excessively complicated variable names (which are only there because it's pseudocode) :)
And yeah, the Chinese tone in practice does not match the idea of "down a little, up a lot" either. It depends on context...
Dunno about the OP, but I'm very aware, as I'm not an English speaker.
I still don't want anything as unpredictable as Unicode in my code. How many different encodings will display as the same variable name and how is the compiler supposed to decide?
If you're thinking of comments and user-facing strings, the OP already excluded those.
The language, compiler, and linker should reject Zalgo in identifiers, and they should reject confusable script mixes in identifiers, but otherwise they should treat all canonically equivalent strings as the same identifier. To make it easier on the linker, compilers should normalize all symbols to one common form (e.g., NFC).
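To make that concrete, here is a rough sketch in Python (chosen only because its standard library ships `unicodedata`; no real compiler is claimed to work this way). The function names, the `MAX_COMBINING_RUN` threshold, and the script check are all made up for illustration. In particular, guessing a script from the first word of a character's Unicode name is a crude stand-in for the real Script property, which the standard library does not expose.

```python
import unicodedata

# Hypothetical threshold: not taken from any standard or compiler.
MAX_COMBINING_RUN = 2


def normalize_identifier(name: str) -> str:
    """Return the NFC form, so canonically equivalent spellings compare equal."""
    return unicodedata.normalize("NFC", name)


def has_zalgo(name: str, limit: int = MAX_COMBINING_RUN) -> bool:
    """True if more than `limit` combining marks appear in a row after NFC.

    Only characters with a nonzero canonical combining class are counted.
    """
    run = 0
    for ch in normalize_identifier(name):
        if unicodedata.combining(ch):
            run += 1
            if run > limit:
                return True
        else:
            run = 0
    return False


def letter_scripts(name: str) -> set[str]:
    """Crude script guess: first word of each letter's Unicode character name."""
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in normalize_identifier(name)
        if ch.isalpha()
    }


def is_acceptable(name: str) -> bool:
    """Reject Zalgo-style stacking and identifiers whose letters mix scripts."""
    return not has_zalgo(name) and len(letter_scripts(name)) <= 1


# Two spellings that render identically but differ as code point sequences.
precomposed = "caf\u00e9"   # U+00E9, one precomposed character
decomposed = "cafe\u0301"   # "e" followed by U+0301 COMBINING ACUTE ACCENT
```

With these definitions, `precomposed == decomposed` is false on the raw strings, while their `normalize_identifier` results compare equal, which is the "treat equivalent strings as equivalent" part. A name like `"p\u0430ypal"` (with a Cyrillic "а") fails `is_acceptable` because its letters come from two scripts.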
C does allow limited Unicode in identifiers, though you need to use the \u prefix and write the code point out. Compilers like Clang let it work like C++ and follow TR31, though this is nonstandard.
Yes, these are the relatively recent additions being discussed here. C and C++ managed just fine for ages without them before the committees decided that scoring brownie points with performative changes was more important than security and readability of source files.