Hacker News new | ask | show | jobs
by iamnotsure 632 days ago
"Multi-script languages

Two of the languages we support, Serbian and Chinese, are written in multiple scripts. To avoid spurious differences in word frequencies, we automatically transliterate the characters in these languages when looking up their words.

Serbian text written in Cyrillic letters is automatically converted to Latin letters, using standard Serbian transliteration, when the requested language is sr or sh."

I'd support keeping both scripts (српска ћирилица and latin script) , similarly to hiragana (ひらがな) and katakana (カタカナ) in Japanese.

1 comments

Why is this a HN comment on a thread about it ending due to AI pollution?