Hacker News
by numeri 4 days ago
There's a large gap between making up words and producing a genuinely native text distribution. LLMs have clear patterns, clear tells, a recognizable "feel" in English, and it's normally even more pronounced in non-English languages.

Lots of bias towards English sentence structure, idioms, etiquette, etc.

1 comment

I didn't notice any of that. Such a bias would be strange, because smaller models certainly don't have the luxury of learning grammar independently: it's all still word sequences, and the languages are quite well separated.