| HN Mirror



	by Yujf 818 days ago
	Why not? Common characters are easier to type, and presumably if you are using regex on a Unicode string, it might include these special characters anyway, so what have you gained?


amelius 817 days ago

In theory yes, in practice no.

What you have gained is that the regex is now much easier to read.


LK5ZJwMwgBbHuVI 817 days ago

> In theory yes, in practice no.

That's like "in theory we need 4 bytes to represent Unicode, but in practice 3 bytes is fine" (glances at universally-maligned utf8mb3)
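For anyone who wants to check the byte counts, here is a minimal Python sketch. The helper names `utf8_len` and `fits_in_three_bytes` are made up for illustration; the only assumption is that a 3-byte-per-character limit like utf8mb3's means anything needing 4 bytes in UTF-8 cannot be stored.

```python
def utf8_len(ch: str) -> int:
    """Number of bytes a single character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


def fits_in_three_bytes(text: str) -> bool:
    """True if every character in text needs at most 3 bytes in UTF-8."""
    return all(utf8_len(ch) <= 3 for ch in text)


# The bracket characters discussed in this thread sit in the Basic
# Multilingual Plane, so they need 3 bytes each; emoji such as U+1F600
# sit outside it and need 4.
for ch in ["a", "é", "⟦", "😀"]:
    print(f"U+{ord(ch):04X} takes {utf8_len(ch)} byte(s)")
```

So the "3 bytes is fine" position holds right up until someone pastes an emoji.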


int_19h 817 days ago

It's not really an issue if the string you're matching might have those characters. It's an issue if the regex you are matching that string against might need to match those characters verbatim. That is actually pretty common with ()[]$ when you're matching phone numbers, prices, etc., so you end up having to escape a lot, and the regex is less readable, especially if it also has to use those same characters as regex operators. On the other hand, it would be very uncommon to want to literally match, say, ⦑⦒ or ⟦⟧.
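To make the escaping point concrete, here is a rough sketch using Python's `re` module. The patterns and the helper names `find_phone` and `find_price` are hypothetical, picked only to show the same characters acting as both literals and operators in one regex.

```python
import re

# Escaped \( and \) match literal parentheses, while the unescaped ( )
# are capture groups: the same characters play both roles in one pattern.
PHONE = re.compile(r"\((\d{3})\) (\d{3})-(\d{4})")

# A literal $ and . must be escaped too; unescaped, they mean
# "end of string" and "any character".
PRICE = re.compile(r"\$(\d+)\.(\d{2})")


def find_phone(text):
    """Return (area, prefix, line) for the first "(555) 123-4567"-style number, or None."""
    m = PHONE.search(text)
    return m.groups() if m else None


def find_price(text):
    """Return (dollars, cents) for the first "$19.99"-style price, or None."""
    m = PRICE.search(text)
    return (int(m.group(1)), int(m.group(2))) if m else None
```

If the grouping operators were spelled with rare brackets like ⟦⟧ instead, the literal ( ) and $ could appear unescaped. That is purely a thought experiment here; Python's `re` does not support such syntax.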


knome 817 days ago

It's easy to read now.
