Hacker News
by elcritch 264 days ago
> as English is to French.

For basic grammar sure, but English has what 30-40% of its vocabulary from French? There's also a lot of influence from Latin and Greek in English as well.

Likely it's just less cross-cultural sharing from Welsh into English. We get much more exposed to more tidbits from Romance languages or German in English than we do Welsh or Gaelic.

> Italo-Celtic hypothesis

Fascinating! Something to read up on.


Yeah, correct: the relationship between French and modern English is much closer because of (among other reasons) the Norman Conquest, which happened long after the Indo-European split and much closer to our time.

> but English has what 30-40% of its vocabulary from French?

You have to be careful what you're counting when you quote figures like that. Here is your comment, but including only the words derived from French:

-----

... basic grammar sure, ............. influence ... Latin ...... just .... cultural .......... exposed† ..... Romance languages .................

† "exposed" is unlike "normal" French-derived words in English in that it is not derived from Old French; the equivalent from Old French is "expound(ed)", and even there I'm not sure why we have ex- instead of es-. I might credit "exposed" more to Latin than French.

-----

Here's English:

-----

for xxxx xxxx xxxx, but English has what 30 to 40 xxxx of its xxxx from French? There's also a lot of xxxx from xxxx and xxxx in English as well.

Likely it's xxxx less xxxx-xxxx sharing from Welsh into English. We xxxx much more xxxx to more tidbits from xxxx xxxx or xxxx in English than we do Welsh or xxxx.

xxxx! Something to read up on.

-----

53 / 71 words (including Welsh, but not Gaelic) are native English.

(Welsh ultimately derives from the name of a Celtic tribe known to us from Roman writers. In Germanic, the name became a generic word for foreigners. I think it's fair to call it English; it was already like that in proto-Germanic. Gaelic is more recent.)

10 / 71 words, including the somewhat questionable exposed, are from French.

5 are Latin, two are Norse, and then there's Gaelic. Greek is not represented except in the -ic ending on Gaelic (or basic).
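The tally above is easy to turn into shares. This is only a sketch: the counts are the ones quoted in this comment (53 English, 10 French, 5 Latin, 2 Norse, 1 Gaelic), while the names `counts` and `origin_shares` are made up for illustration.

```python
# Word counts by origin, as tallied in the comment (71 words in total).
counts = {"English": 53, "French": 10, "Latin": 5, "Norse": 2, "Gaelic": 1}

def origin_shares(counts):
    """Return each origin's share of the total as a percentage (one decimal)."""
    total = sum(counts.values())
    return {origin: round(100 * n / total, 1) for origin, n in counts.items()}

shares = origin_shares(counts)
total = sum(counts.values())
for origin, pct in shares.items():
    print(f"{origin:8s} {counts[origin]:2d} / {total}  {pct:5.1f}%")
```

By running-word count that puts English at roughly 75% and French at roughly 14% of the comment.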

If you're listening to someone speak English, knowing French is unlikely to be worth much.

Nice observation but it just illustrates what the GP is saying: the basic grammar is English while a huge proportion of the vocabulary comes from French. If you remove the grammatical words from the English selection you made, there's hardly anything left.

> If you're listening to someone speak English, knowing French is unlikely to be worth much.

It can help a lot when learning because of the huge vocabulary overlap: for more or less every word ending in -tion, you just learn to pronounce it differently.

I thought this was an interesting idea.

I rated each word in the comment for how much I felt it represented grammar vs semantics (total adding to 1 for each word; ratings in increments of 0.1).

The ratings divided into 31.5 words worth of syntax and 37.5 words worth of semantics, adding up to 69 instead of 71 because I combined "a lot" and "as well" into one word each for this purpose.

French accounted for 6% of the grammar (reflecting my rating of "sure" and "just" as 90% "grammatical" each), and 22% of the semantics.

English got 91% of the grammar and 59% of the semantics. The point you might be most likely to disagree with is that I rated many prepositions as 50% semantic. (For example, "to" in the phrase "thirty to forty" got that rating, although "to" in "get exposed to" and "something to read up on" were rated 0% semantic.) The second point, cutting in the other direction, is that I rated all pronouns as 0% semantic; realistically they should rate a bit higher. In a better model, I'd probably like to rate them 100% grammatical and also ~30% semantic.

(The residual ~3% of grammar is the passive marker "get", from Norse.)
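For anyone who wants to try the same split with their own ratings, here is a rough sketch. The helper `weighted_shares` and the tuple layout are invented for illustration. The only figures taken from this comment are the 31.5-word grammar total, the 0.9 grammar ratings for "sure" and "just", and "get" as the Norse residual. Rating "get" at a full 1.0 and lumping the rest of the grammar column under English are both assumptions.

```python
from collections import defaultdict

def weighted_shares(ratings):
    """ratings: iterable of (word, origin, grammar_weight), weight in [0, 1].

    Each word adds grammar_weight to its origin's grammar total and
    (1 - grammar_weight) to its origin's semantics total.
    Returns (grammar_pct, semantics_pct), two dicts keyed by origin.
    """
    grammar = defaultdict(float)
    semantics = defaultdict(float)
    for _word, origin, g in ratings:
        grammar[origin] += g
        semantics[origin] += 1.0 - g
    g_total = sum(grammar.values()) or 1.0   # avoid dividing by zero
    s_total = sum(semantics.values()) or 1.0
    grammar_pct = {o: 100 * v / g_total for o, v in grammar.items()}
    semantics_pct = {o: 100 * v / s_total for o, v in semantics.items()}
    return grammar_pct, semantics_pct

# Sanity check of the grammar column, using only the figures stated above.
GRAMMAR_TOTAL = 31.5
french = 0.9 + 0.9                        # "sure" and "just"
norse = 1.0                               # "get" (assumed fully grammatical)
english = GRAMMAR_TOTAL - french - norse  # assumption: the remainder is English

for name, value in [("English", english), ("French", french), ("Norse", norse)]:
    print(f"{name}: {100 * value / GRAMMAR_TOTAL:.0f}% of grammar")
```

Under those assumptions the grammar column comes out at about 91% English, 6% French and 3% Norse, which agrees with the percentages quoted.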

If this is the kind of thing you enjoy, I'd be interested in your evaluation.

I'd say I'm quite sceptical about that kind of evaluative scheme because it seems to add a degree of subjectivity and arbitrariness about how things are rated.

At a first pass I'd just say that adjectives, nouns, and adverbs are "vocabulary", and everything else is grammar.

That won't work as a first pass. That gets you results like "there's also a lot of influence from French" being 2/3 semantics and 1/3 grammar†, with "there" holding just as much semantic content as "influence" does. It also disqualifies pronouns from counting as grammar at all, which is much more defensible than disqualifying semantically empty words, but not a common perspective.

I tend to take the perspective that if a foreign speaker is unlikely to have any trouble learning how to use a word correctly, that word is semantic, and otherwise, the word is grammatical.

† Assuming that the omission of verbs from your list of semantic words was a mistake. Otherwise you're up to 44% grammar. I did count "is" as being grammar, but I would certainly not extend that judgment to all verbs.
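To make the arithmetic behind the 2/3 vs 1/3 (and 44%) figures concrete, here is a small sketch. The part-of-speech tags are assigned by hand and are an assumption, in particular splitting "there's" into "there" + "is" and tagging "there" as an adverb, which is what makes it count as vocabulary. `grammar_fraction` is a made-up helper.

```python
from fractions import Fraction

# Hand-tagged tokens of "there's also a lot of influence from French".
TAGGED = [
    ("there", "adverb"), ("is", "verb"), ("also", "adverb"),
    ("a", "determiner"), ("lot", "noun"), ("of", "preposition"),
    ("influence", "noun"), ("from", "preposition"), ("French", "noun"),
]

def grammar_fraction(tagged, vocabulary_tags):
    """Fraction of tokens whose tag is NOT one of the 'vocabulary' tags."""
    grammatical = [word for word, tag in tagged if tag not in vocabulary_tags]
    return Fraction(len(grammatical), len(tagged))

# Rule as first stated: adjectives, nouns and adverbs are vocabulary.
without_verbs = grammar_fraction(TAGGED, {"adjective", "noun", "adverb"})
# Same rule with verbs added to the vocabulary side.
with_verbs = grammar_fraction(TAGGED, {"adjective", "noun", "adverb", "verb"})

print(without_verbs, float(without_verbs))  # 4/9, about 0.44
print(with_verbs, float(with_verbs))        # 1/3
```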

--- results ---

By your standard, English is 61% of the semantics and 91% of the grammar (if verbs have no semantics), or 62% of the semantics and 96% of the grammar (if verbs do have semantics).

French is 21% of the semantics and 6% of the grammar (if verbs have no semantics), or 20% of the semantics and 4% of the grammar (if verbs do have semantics).

I don't think much of your methodology, but it's worth noting that your overall numbers are almost identical to mine. (When verbs are meaningless; still very close but distinguishable otherwise.)

In reality, of course, many verbs such as "sharing" are rich in semantics, and many others such as "do" are more or less empty.

Oh, true, it was just a mistake to exclude verbs. Of course they should be vocabulary.

But I think of pronouns as grammatical, as well as the auxiliary particles in verb forms like "there is", "to go to", etc. So "have" and "is" can function grammatically when they're part of the verb form of another root verb, like "have been seen" and so on.

"Do" is obviously semantic when it's the main verb, e.g. "I'm doing my job" versus "I'm leaving my job". In the selection you quoted it's also playing a grammatical role which is just to point to the main verb form of the sentence, i.e. it could be replaced by repeating "get exposed to (titbits from)" without changing the meaning of the sentence.

So in "there is also a lot of influence from French", I would put "there _ also a _ of _ from _" as grammatical.

I'm sure my way is naive, but it's based, I think, on well-established categories. I'm not sure how linguists would distinguish grammatical words, or even whether they categorize based on words at all. For example, "a lot of" as a quantifier might be completely grammatical, same as "more", "less", "thirty", etc.