Hacker News
by starfallg 1596 days ago
It seems we need another layer that tokenizes according to context. Breaking a long number into groups of 3 or 4 digits is the correct behaviour when we are dealing with phone numbers, but it would be completely wrong for nearly anything else.
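
To make that concrete, here is a rough sketch of context-free digit chunking. The `chunk_digits` helper and its fixed group size of 3 are my own assumptions for illustration, not how any particular tokenizer actually works:

```python
import re

def chunk_digits(text: str, size: int = 3) -> list[str]:
    """Hypothetical helper: split text into tokens, breaking every run of
    digits into left-to-right groups of `size`, regardless of context."""
    tokens = []
    # Alternate between runs of decimal digits and runs of everything else.
    for piece in re.findall(r"\d+|\D+", text):
        if piece.isdecimal():
            tokens.extend(piece[i:i + size] for i in range(0, len(piece), size))
        else:
            tokens.append(piece)
    return tokens

print(chunk_digits("5551234567"))  # ['555', '123', '456', '7']
print(chunk_digits("1000000"))     # ['100', '000', '0']
print(chunk_digits("in 2024"))     # ['in ', '202', '4']
```

The phone number comes out roughly in dialable groups, but one million is split against its thousands grouping (1,000,000) and the year is cut into "202" and "4". A context-aware layer would have to decide which kind of number it is looking at before picking the split.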