by jrabone 4779 days ago
Not true for Java. The java.util.regex.Pattern docs even say:

  \d         A digit: [0-9]
  \p{Digit}  A decimal digit: [0-9]
which is actually somewhat depressing. I'd expect the named class to include the full Unicode digit set. It's surprising to see:

  ab1234567890cd matched 1234567890
  ab𝟣𝟤𝟥𝟦𝟧𝟨𝟩𝟪𝟫𝟢cd no match
from code using Pattern.compile("(\\p{Digit}+)");
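
For comparison, here is a rough sketch of two alternatives that should pick up non-ASCII digits: the Unicode general category \p{Nd}, and the Pattern.UNICODE_CHARACTER_CLASS flag (Java 7 and later), which widens \p{Digit} and \d to the Unicode definition. The class name DigitClassDemo and both helper methods are made up for illustration, not taken from the code above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DigitClassDemo {
    // Hypothetical helper: returns group 1 of the first match, or null when nothing matches.
    static String firstMatch(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    // Builds "ab<digits>cd" using MATHEMATICAL SANS-SERIF DIGITs (U+1D7E2..U+1D7EB),
    // in the same 1..9,0 order as the sample input above.
    static String sansSerifSample() {
        StringBuilder sb = new StringBuilder("ab");
        for (char c : "1234567890".toCharArray()) {
            sb.appendCodePoint(0x1D7E2 + (c - '0'));
        }
        return sb.append("cd").toString();
    }

    public static void main(String[] args) {
        String s = sansSerifSample();
        // POSIX-style class, ASCII only by default
        System.out.println(firstMatch("(\\p{Digit}+)", 0, s));
        // Unicode general category "decimal digit number"
        System.out.println(firstMatch("(\\p{Nd}+)", 0, s));
        // Same POSIX-style class, but with the Unicode flag switched on
        System.out.println(firstMatch("(\\p{Digit}+)", Pattern.UNICODE_CHARACTER_CLASS, s));
    }
}
```

Which of the two is preferable probably depends on whether you want every predefined class in the pattern (\w, \s, \b and friends) to go Unicode as well, which is what the flag does, or only the digit class.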

EDIT: and perhaps more surprising to see in the logs:

  Exception in thread "main" java.lang.NumberFormatException: For input string: "𝟤𝟥𝟦𝟧"
  	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
  	at java.lang.Integer.parseInt(Integer.java:449)
That'll keep someone guessing for a while...
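
My reading of that trace, which I have not verified against every JDK release, is that Integer.parseInt examines one UTF-16 char at a time, so the two surrogate halves of a digit outside the BMP are never recognised as a digit. A minimal workaround sketch that walks code points instead is below. CodePointDigits and parseUnicodeInt are hypothetical names, and it only handles unsigned decimal input.

```java
class CodePointDigits {
    // Hypothetical helper: parses a non-negative decimal integer written with any
    // Unicode decimal digits, iterating over code points rather than UTF-16 chars.
    static int parseUnicodeInt(String s) {
        if (s.isEmpty()) {
            throw new NumberFormatException("For input string: \"\"");
        }
        int value = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            int d = Character.digit(cp, 10); // -1 when cp is not a decimal digit
            if (d < 0) {
                throw new NumberFormatException("For input string: \"" + s + "\"");
            }
            // multiplyExact/addExact (Java 8+) throw ArithmeticException on int overflow
            value = Math.addExact(Math.multiplyExact(value, 10), d);
            i += Character.charCount(cp); // step over both surrogates when needed
        }
        return value;
    }

    public static void main(String[] args) {
        // The string from the stack trace: sans-serif digits 2, 3, 4, 5
        String sans = new StringBuilder()
                .appendCodePoint(0x1D7E4).appendCodePoint(0x1D7E5)
                .appendCodePoint(0x1D7E6).appendCodePoint(0x1D7E7)
                .toString();
        System.out.println(parseUnicodeInt(sans));
    }
}
```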