by IanCal 931 days ago
I've wondered for years how far you could get just by checking perplexity: English -> internal rep, and language X -> internal rep, then a mapping between the internal reps such that English -> another language has low perplexity. That is, a sensible sentence in English should result in a sensible sentence in the other language.
2 comments

Some form of internal representation is crucial. Translation is an n^2 problem where some nodes like Chinese, English and Spanish have much thicker arrows, which makes traditional approaches awful for less common language pairs.
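To put rough numbers on the n^2 point, here is a minimal sketch. It assumes one model per translation direction in the pairwise case, and one encoder plus one decoder per language when everything goes through a shared internal representation. The function names are made up for illustration.

```python
def direct_pair_models(n_languages: int) -> int:
    """Directed language pairs: every language to every other language."""
    return n_languages * (n_languages - 1)


def shared_rep_models(n_languages: int) -> int:
    """One encoder (language -> internal rep) and one decoder
    (internal rep -> language) per language."""
    return 2 * n_languages


if __name__ == "__main__":
    for n in (3, 10, 100):
        print(n, direct_pair_models(n), shared_rep_models(n))
```

With 100 languages that is 9900 directed pairs versus 200 components, which is why the rare pairs are the ones that suffer under a pairwise approach.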

Aside from the lack of training data in many languages, I get the impression that tech companies like Google have been anglocentric in their approach, resulting in OK results only if at least one of the languages is “big”. That’s one thing that’s amazing about ChatGPT: it doesn’t discriminate between languages much, or at least it seems able to transfer knowledge really well between languages. It seems to find the higher-level patterns of human knowledge, to the point where language or even style is basically just a frontend.

Ironically, it seems the less you bother to teach computers about linguistics, the better they perform at language.

'perplexity'?
The wiki link is good. In the context here it's easy to picture it as how weird a sentence would sound to a native speaker. Low perplexity means what was generated would be unsurprising if you saw it in the dataset.
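For a concrete picture, here is a minimal sketch using the standard definition: perplexity is the exponential of the average negative log-probability per token. The per-token probabilities below are made-up numbers, not output from any real model.

```python
import math


def perplexity(token_probs):
    """exp of the mean negative log-probability (natural log) per token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log_prob = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log_prob)


# Hypothetical probabilities a model assigned to each token of two sentences.
natural_sentence = [0.5, 0.25, 0.5, 0.25]
weird_sentence = [0.01, 0.02, 0.01, 0.02]

if __name__ == "__main__":
    print(perplexity(natural_sentence))  # lower: unsurprising to the model
    print(perplexity(weird_sentence))    # higher: sounds "weird" to the model
```

If every token had probability 1/4, the perplexity would be exactly 4, so you can read it as the number of equally likely choices the model is effectively picking between at each step.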
I'll check the wiki link, but how is perplexity different from the measure of Surprise, in terms of Shannon's stuff?