Hacker News new | ask | show | jobs
by eatonphil 1064 days ago
Yes whitespace in unicode is expansive. However, you could (and I assume most languages do) specify that a newline is \n or \r\n which are expressible in ASCII.

Maybe I'm wrong though, just an assumption about what's common.

(See for example how Go, which is Unicode aware, defines tokens: https://go.dev/ref/spec#Tokens.)