by function_seven
1161 days ago
> Spanish is a lot more dense than English or French and might tokenize better.

I'm no linguist, so I apologize if I'm misinterpreting this statement. My impression has always been that Spanish is less dense than English, if only because the Spanish version of product instructions is almost always wordier. Look at the back of a shampoo bottle[0] and notice that the Spanish version is either longer or set in a smaller font to fit it all.

[0] https://i.postimg.cc/xd2X5WJN/Ghub-Fo-N11u8jz-Pjj-RDt-W-CGA9...

One area where Spanish is more dense is verb forms, because it retains most of the inflected verbs of Latin, whereas English has lost or merged many of the historical Indo-European inflections. Speaking intuitively, though, I think Spanish, like most Latin languages, tends to be a bit more verbose with noun phrases.