by Yujf 818 days ago
Why not? Common characters are easier to type, and presumably if you are using regex on a Unicode string, the string might include these special characters anyway, so what have you gained?

In theory yes, in practice no.

What you have gained is that the regex is now much easier to read.

> In theory yes, in practice no.

That's like "in theory we need 4 bytes to represent Unicode, but in practice 3 bytes is fine" (glances at universally-maligned utf8mb3)
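
For anyone who hasn't run into this: a quick sketch of why 3 bytes isn't enough, assuming plain Python 3 and nothing else. The helper name `utf8_len` is made up for illustration. Anything outside the Basic Multilingual Plane (emoji, for instance) takes 4 bytes in UTF-8, which is exactly what a 3-byte-max encoding can't hold.

```python
def utf8_len(ch: str) -> int:
    """Number of bytes the given text occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))

# ASCII letter, a BMP character (euro sign), and a non-BMP character (emoji).
samples = ["A", "\u20ac", "\U0001F600"]
for s in samples:
    print(repr(s), utf8_len(s))
```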

It's not really an issue if the string you're matching might have those characters. It's an issue if the regex you're matching that string against might need to match those characters verbatim. Which is actually pretty common with ()[]$ when you're matching phone numbers, prices, etc., so you end up having to escape a lot, and the regex is less readable, especially if it also has to use those same characters as regex operators. On the other hand, it would be very uncommon to want to literally match, say, ⦑⦒ or ⟦⟧.
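
To make the escaping point concrete, here's a rough sketch in Python's `re`. The phone and price formats and the names `PHONE`, `PRICE`, `parse_phone` and `parse_price` are just assumptions for illustration, not anything from upthread. Note how the same `(`, `)` and `$` appear both escaped (literal) and unescaped (operator) in one pattern.

```python
import re

# Assumed formats, for illustration only: "(555) 123-4567" and "$19.99".
# \( and \) are literal parentheses; the bare ( ) are capture groups.
PHONE = re.compile(r"\((\d{3})\) (\d{3})-(\d{4})")
# \$ and \. are literals; unescaped, $ would be an anchor and . a wildcard.
PRICE = re.compile(r"\$(\d+)\.(\d{2})")

def parse_phone(text):
    """Return (area, prefix, line) or None if the text doesn't match."""
    m = PHONE.fullmatch(text)
    return m.groups() if m else None

def parse_price(text):
    """Return (dollars, cents) as ints or None if the text doesn't match."""
    m = PRICE.fullmatch(text)
    return (int(m.group(1)), int(m.group(2))) if m else None
```

If the grouping operators were something like ⟦⟧ instead, the literal parentheses in the phone pattern would need no backslashes at all, which is the readability gain being argued for.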
It's easy to read now.