by tgv 1 day ago
I didn't notice any of that. Such a bias would be strange, because smaller models certainly don't have the luxury of learning grammar independently: it's still just word sequences, and languages are quite well separated.