Hacker News
by zarzavat 1208 days ago
There are still several languages that don't use spaces, like Thai and Cambodian. Both of these languages are very analytic - they use short words, which makes it easier.

The actual problem with not using spaces is not that humans can't read it, it's that computers can't read it (at least not without a complete dictionary and maybe some AI help). GNU aspell, for instance, does not support languages that don't use spaces [1].

[1] http://aspell.net/0.61/man-html/Unsupported.html
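
To make the dictionary point concrete, here's a rough sketch of greedy longest-match segmentation, which is about the simplest dictionary-based approach. Everything in it is an assumption for illustration: the `segment` function, the toy English word list, and the squashed English input standing in for a language without spaces. Real segmenters are considerably smarter than this.

```python
def segment(text, dictionary):
    """Greedy longest-match segmentation.

    Characters that start no dictionary word are emitted as 1-character tokens.
    """
    max_len = max(map(len, dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a dictionary word fits.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in dictionary:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # Nothing matched: emit one character so we always make progress.
            tokens.append(text[i])
            i += 1
    return tokens


# Toy dictionary, made up for this example.
WORDS = {"the", "cat", "sat", "on", "mat"}

print(segment("thecatsatonthemat", WORDS))
# → ['the', 'cat', 'sat', 'on', 'the', 'mat']
```

Without the word list the function has nothing to go on and falls back to single characters, which is roughly the position a space-based spell checker is in.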

2 comments

Yeah, context is essential. We've seen plenty of examples of website names where the spaces being squashed out gives alternative meanings. Two come to mind where the last word was "exchange" and it followed a plural. No simple-minded spel chequer is going to be able to figure that out.
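
A quick sketch of why a dictionary alone doesn't settle it: enumerate every way to split a squashed string into dictionary words, and you often get more than one valid answer. The `all_segmentations` helper, the word list, and the harmless "nowhere" input are all made up for illustration.

```python
def all_segmentations(text, dictionary):
    """Return every way to split text into dictionary words."""
    if not text:
        return [[]]  # one way to split the empty string: no words at all
    results = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in dictionary:
            # Keep this prefix and recurse on whatever is left.
            for tail in all_segmentations(text[end:], dictionary):
                results.append([head] + tail)
    return results


# Toy dictionary, made up for this example.
WORDS = {"no", "now", "where", "here", "nowhere"}

for split in all_segmentations("nowhere", WORDS):
    print(" ".join(split))
# → no where
# → now here
# → nowhere
```

All three readings are made of real words, so picking the intended one takes context, not just a lookup.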
An example of a text segmentation library for Japanese:

https://en.m.wikipedia.org/wiki/MeCab

Like you said, it does rely on a dictionary to work properly.