Hacker News
by soldehierro 1209 days ago
> canyoureadthiseasily?ididn'tthinksoeither.nobodywriteslikethisinenglish

This is very legible for a native English speaker.

3 comments

It’s legible, but it takes a bit of active effort, which makes it tiring to read more than a few sentences in a single sitting.
Spaces weren't always used. If you look at old Greek/Roman engravings, everything is jammed together without spaces. I think you could eventually become quite good at reading spaceless text, even though spaces definitely aid comprehension.
There are still several languages that don't use spaces, like Thai and Cambodian. Both of these languages are very analytic - they use short words, which makes it easier.

The actual problem with omitting spaces is not that humans can't read the text, it's that computers can't (without at least a complete dictionary and maybe some AI help). GNU aspell, for instance, does not support languages that don't use spaces [1].

[1] http://aspell.net/0.61/man-html/Unsupported.html
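
To make the dictionary point concrete, here is a minimal sketch of greedy longest-match segmentation. The tiny WORDS set and the max_match name are made up for illustration; this is not how aspell or any particular tool works, just the simplest dictionary-driven approach I can think of.

```python
# Toy vocabulary, assumed for illustration; a real segmenter needs a full dictionary.
WORDS = {"no", "body", "nobody", "write", "writes", "like", "this", "in", "english"}

def max_match(text, words):
    """Greedy longest-match segmentation; unknown characters become 1-char tokens."""
    longest = max(map(len, words))
    tokens, i = [], 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a dictionary word matches.
        for size in range(min(longest, len(text) - i), 0, -1):
            if text[i:i + size] in words:
                break
        else:
            size = 1  # no dictionary word starts here: emit a single character
        tokens.append(text[i:i + size])
        i += size
    return tokens

print(max_match("nobodywriteslikethisinenglish", WORDS))
```

Without the dictionary the function has nothing to match against and falls back to single characters, which is roughly the position a spell checker is in when it gets spaceless text.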

Yeah, context is essential. We've seen plenty of examples of website names where the spaces being squashed out gives alternative meanings. Two come to mind where the last word was "exchange" and it followed a plural. No simple-minded spel chequer is going to be able to figure that out.
An example of a text segmentation library for Japanese:

https://en.m.wikipedia.org/wiki/MeCab

Like you said, it does rely on a dictionary to work properly.

It's legible (ish) because there are only limited ways that English characters can group up, e.g. there's only one valid way to split up "writeslikethis". In Japanese, many common words are only 1-2 characters long, so in general even a very short string of kana can be split up in multiple valid ways.
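
A rough way to check how many valid splits a string has is to enumerate every segmentation against a word list. This is only a sketch: the WORDS set and the segmentations helper are hypothetical, and the counts depend entirely on which words you put in the dictionary.

```python
from functools import lru_cache

# Toy vocabulary, assumed for illustration only.
WORDS = frozenset({"write", "writes", "like", "this", "his", "is",
                   "no", "now", "here", "where", "nowhere"})

def segmentations(text, words=WORDS):
    """Return every way to split `text` into dictionary words."""
    @lru_cache(maxsize=None)
    def split_from(start):
        if start == len(text):
            return [[]]  # one way to segment the empty remainder
        results = []
        for end in range(start + 1, len(text) + 1):
            word = text[start:end]
            if word in words:
                results += [[word] + rest for rest in split_from(end)]
        return results
    return [" ".join(parts) for parts in split_from(0)]

print(segmentations("writeslikethis"))
print(segmentations("nowhere"))
```

With this word list, "writeslikethis" comes back with a single split, while "nowhere" comes back with three ("no where", "now here", "nowhere"). The more short words the dictionary contains, the faster the number of candidate splits grows.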
Yeah, no problem here. I've been learning Japanese for a while and have encountered the phenomenon, though. Still, if you look at e.g. the front page of Japanese Wikipedia (https://ja.wikipedia.org/), there's plenty of kana there, be it words that are habitually written entirely in katakana or hiragana, or even kanji words that still have some attached kana to disambiguate readings.

After more practice, I've found myself starting to pick up on common dividers, like particles, verb endings, adjective endings, etc. I assume native speakers do this instinctively, much like native English readers aren't really reading letter by letter (wichh is why txet lkie tihs is rbleadae at ntiave seepd for most).