by teleforce 490 days ago
Related HN posts [1], [2].

Fun fact: the most common words of the Indo-European family are surprisingly similar across Sanskrit (S) <--> English (E) <--> German (G) [3].

Pitara (S) <--> Father (E) <--> Vater (G)

Matara (S) <--> Mother (E) <--> Mutter (G)

Bhratara (S) <--> Brother (E) <--> Bruder (G)

Duhitar (S) <--> Daughter (E) <--> Tochter (G)

[1] New insights into the origin of the Indo-European languages (147 comments):

https://news.ycombinator.com/item?id=36930321

[2] Ancient genomes provide final word in Indo-European linguistic origins (16 comments):

https://news.ycombinator.com/item?id=42515584

[3] Turandot and the Deep Indo-European Roots of “Daughter” (15 comments):

https://news.ycombinator.com/item?id=29450507

8 comments

My dad has literally just published a book (in Russian) with about 850 words with near identical sound and meanings in Russian and other Slavonic languages. :)

https://borissoff.wordpress.com/2025/02/06/russian-sanskrit-...

For my part I built the web based editing tool, DB and LaTeX generation system that he used to assemble this massive undertaking over the years. :)

https://borissoff.wordpress.com/2015/10/30/first-public-pres...

It was interesting hearing him talk about how you can see pieces of the original proto language preserved in the different languages. E.g. Russian has 6 cases, Sanskrit has some of these but also others and the original language had something like 12 (I don’t have any particular knowledge on the subject so might be misremembering).

For me it was interesting that the original language seemed to be more complex than the modern descendants, like there is a general trend towards simplification with time. In my mind then there is the question as to where the original complex language came from, and why a culture that we would consider more primitive than ours would need and come up with one.

The complexity of natural human languages comes in different forms, but as a general rule, whenever you see something that's built into another language and "missing" from your own, you can express it by using more words. For example, PIE had a lot of noun cases that aren't in English, but you don't need the instrumental case to precisely express its purpose. You can say something like "by means of a forklift."

Some studies actually suggest that literacy systematically pressures languages to use longer, more complex sentences, thus disincentivizing complex inflection rules.

I get that part - I speak both English and Russian and the latter is more concise and nuanced due to the more complex grammar.

It’s just interesting that the apparent trend is from complexity to simplification, like what I observed with English as grammar is not taught so much here in England anymore. It could well be (and likely is) an illusion stemming from my shallow knowledge of the subject of linguistics.

When I was learning Spanish in Central America, I met people there learning English. As we would help each other learn, they always commented how lucky I was to be learning Spanish because all the tenses and general regularity made it easy to learn, but they thought English was so difficult to learn because of the seeming lack of rules and regularity.

In some regards English is simpler, but in other ways it is more complex in order to compensate for what’s lost in simplification elsewhere. English is simplified morphologically, but word order does a lot of heavy lifting instead, and it’s often apparent when speaking to someone who hasn’t yet mastered the language.

There is a relevancy bias here. From the perspective of a highly literate society we see fewer grammar rules as simpler. But is it, really? It is substituting one complexity for another. English has fewer noun cases, but a multitude of prepositional phrases that are really hard to keep straight.

The grammar of a language tends to swing back and forth on these factors, perhaps partly guided by literacy and the rest a random walk, and what is “simpler” to us might be a subjective statement based on what we speak now.

That makes sense and “simpler” is probably not the right word to have used, as that is too positive. :)

To fall back on the reliable technology principle: it depends on your use case.

More concise and nuanced but interestingly with a lower information density.

> built into another language and "missing" from your own, you can express it by using more words. ... "by means of a forklift."

and that "more words" combination may be more precise, expressive and much simpler to handle in communication in some contexts (not necessarily in all though) than, say, something like <prefix><word root><suffix 1><suffix2> with <suffix>-es being "juschij" and the like (my past comment on that https://news.ycombinator.com/item?id=40244902 )

An example: "Petr kicked Ivan" and "Ivan kicked Petr" are 2 opposite things in English, while in Russian I can use all 6 orderings of the words "Petr", "kicked", "Ivan" and still say the same thing just by using the necessary suffixes to express the case, and by switching suffixes I can use the same 6 orderings to express the opposite ("Ivana pnul Petr", "Petr pnul Ivana", "Pnul Ivana Petr" and so on all mean the same thing, while "Ivan pnul Petra", "Petra pnul Ivan", ... mean the opposite). Great for writing poetry, but not that good for contexts where concise and precise communication is at a premium, for example in the tech world.
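
A toy illustration of the case-ending point above, as a rough sketch rather than a real parser. The rule "a name ending in -a is the object" is an assumption that only holds for these two masculine names, and `parse_russian` / `parse_english` are made-up helper names.

```python
from itertools import permutations

VERB = "pnul"  # "kicked", as transliterated in the comment above

def parse_russian(words):
    """Toy rule: the name ending in -a is the object (accusative), the other is the subject."""
    names = [w for w in words if w.lower() != VERB]
    obj = next(n for n in names if n.endswith("a"))
    subj = next(n for n in names if not n.endswith("a"))
    return subj, obj[:-1]  # drop the case ending to recover the base name

def parse_english(words):
    """Toy rule: position alone decides the roles (subject, verb, object)."""
    return words[0], words[2]

# All six orderings of the Russian sentence give the same reading.
orders = list(permutations(["Petr", VERB, "Ivana"]))
readings = {parse_russian(order) for order in orders}
print(len(orders), readings)  # 6 {('Petr', 'Ivan')}

# Swapping the suffix flips the meaning regardless of order.
print(parse_russian(["Petra", VERB, "Ivan"]))  # ('Ivan', 'Petr')

# In English, swapping positions is what flips the meaning.
print(parse_english(["Petr", "kicked", "Ivan"]))  # ('Petr', 'Ivan')
print(parse_english(["Ivan", "kicked", "Petr"]))  # ('Ivan', 'Petr')
```

The sketch only tracks who did what to whom; it says nothing about the differences in emphasis between the orderings.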

This is an interesting and somewhat orthogonal conversation (and sadly not what HN comments are designed for).

The 3 examples you give in each case are not the same though - they have a different colour to them and would be “wrong” to use depending on the context. This is precisely the sort of nuance that I mentioned in one of the other comments and like you say it’s great for poetry but also for encoding additional context in fewer words. Incidentally, I recall my dad pointing this out as another similarity to Sanskrit.

As an example: I once spent some time trying to explain to my wife the difference between «какая-то фигня» and «фигня какая-то». Same words quite different meaning. :)

Taking it further, this difference can be used as a lens to see the fundamental difference between Western and Eastern philosophy and way of thinking but that’s a whole separate rabbit hole. (This is much more my subject of interest rather than linguistics.)

> Pitara (S) <--> Father (E) <--> Vater (G)

> Matara (S) <--> Mother (E) <--> Mutter (G)

> Bhratara (S) <--> Brother (E) <--> Bruder (G)

> Duhitar (S) <--> Daughter (E) <--> Tochter (G)

Since you seem to be quoting the Sanskrit words in their root forms, (to which the case-lacking English and German equivalents most closely correspond) your spellings are incorrect. The correct forms are:

pitr

mātr

bhrātr

duhitr

No thematic 'a' on the end.

You might be confusing it with the nominative plural case forms:

pitarah

mātarah

bhrātarah

duhitarah

Thanks for the info, that makes the words even more similar to each other across three main languages of the Indo-European family.

Similarities like these, especially with Latin in the mix, were the clue that originally put early linguists on the scent of the IE language family several centuries ago. Since then, extensive research has been done into how exactly these languages developed from their common ancestors. Some modern dictionaries, like Wiktionary, contain entire family trees comparing the divergent development of these cognates and many, many others.

> Pitara (S) <--> Father (E) <--> Vater (G)

>Matara (S) <--> Mother (E) <--> Mutter (G)

Also some roots of the smaller natural numbers, like (E): one, two, three, four, five, six, seven, eight, nine, ten, etc.

(G) eins, zwei, drei, ...

(S) eka, dvi, tri, ...

See the "Table" here:

https://en.m.wikipedia.org/wiki/Devanagari_numerals

Although it is about numerals, there are words in a few languages, on the right side.

And Sanskrit is the ancestor of many Indian languages, such as the regional languages of most of the northern (e.g. Punjabi, Haryanvi, Himachali, Hindi and its dialects), central (e.g. Hindi), eastern (e.g. Bengali, Odiya) and western (e.g. Gujarati, Marwadi) Indian states. To a rough approximation, only the languages of the 4 (now 5, with Telangana added) southern states, and of the 6 / 7 north-eastern states (Assam, Manipur, Mizoram, Meghalaya, etc.) and maybe a few aboriginals' / forest tribals' languages, like Bhil, Gond, etc., don't descend from Sanskrit.

The numbers one to ten across the three main Indo-European languages, namely Sanskrit, English and German, just confirm that they are from the same language tree.

The same goes for the Malay-Austronesian language family, which is spoken in Taiwan, the Malay archipelago and further away in the Polynesian islands, including by the native people of New Zealand and Hawaii. Their numbers one to ten are very similar across a very wide geographical area, confirming they are from the same language tree. Fun fact: their most common shared word is the one for coconut (nyior/nyiur), which further cements their status as the community with the largest number of islands, because the coconut tree is the trademark of their island environment.

[1] Austronesian peoples:

https://en.wikipedia.org/wiki/Austronesian_peoples

thanks for sending me into that rabbit hole :)

I was already interested in polynesia, from quite a while ago, and had read some books about it, and also a great National geographic magazine series about ancient polynesian navigators, who did not have any modern instruments, they just used knowledge and observation carried across generations, of patterns of wind, stars, ocean waves and swells, sea and land bird movements, clouds, et cetera, to navigate thousands of miles across the Pacific, to both initially discover and settle, and later travel between, multiple islands and Island groups in the Pacific.

the hokulea saga is an example.

Lots of verbs too.

For example, 'to be' - French 'etre' (circumflex over the e indicates old 's' after the e), Marathi 'asane' (pronounced esnay)

'to go', German gehen, Marathi jana (when conjugated the j becomes hard)

'to give', french 'donner', Hindi 'danaa' (pronounced similarly)

'to mix', french 'melanger', Hindi 'melaanaa'

Other non-obvious ones:

Vedas and Wisdom / Wit. Alternatively, Latin video (to see)

Dyaus-pitar and Jupiter, Zeus-pater

'that' in English is 'que' (that/what) in french and 'kya' (for what) or 'ki' (for that) in Hindi (pronounced similarly to French 'que').

English burden or 'to bear' and Hindi bhar (burden)

English 'ignite', Latin 'ignis' and Indic 'agni' (fire)

'Raja' and 'regal' or 'royal'

'Dental' and Hindi 'dant' (tooth)

Greek 'polis' and Indic 'pore' / 'pur' / 'puram' (the 'r' is pronounced like a soft l)

> Dyaus-pitar and Jupiter, Zeus-pater

This one is slightly more interesting than a mere cognate as it is believed that the Proto-Indo-European speakers worshipped a sky god with the reconstructed name *Dyḗus ph₂tḗr ("sky-father") which is the ancestor of these (also Tyr and the like on the Germanic side). See:

https://en.wikipedia.org/wiki/*Dy%C4%93us "*Dyēus is considered by scholars the most securely reconstructed deity of the Indo-European pantheon, as identical formulas referring to him can be found among the subsequent Indo-European languages and myths of the Vedic Indo-Aryans, Latins, Greeks, Phrygians, Messapians, Thracians, Illyrians, Albanians and Hittites."

What I find interesting is that the primary Turkic/Mongolic deity, Tengri, is also a sky father. There’s no shared genetic or linguistic ancestry there, just two different steppe nomad populations independently deifying the daylight sky the same way.

What you are talking about is gök/kök tengri (lit. sky god). There are other gods in the Turkic/Mongolic pantheon, like ülgen/ulgan, yer tengri (earth god).

There is a connection. Not DNA, but via trade with the Saka/Scythians, who were descendants of PIE speakers.

All steppe nomads are culturally descended from Yamna.

French être is from PIE h₁ésti https://en.wiktionary.org/wiki/Reconstruction:Proto-Indo-Eur... which also gave rise to Marathi आथि (āthi). Marathi असणे (asṇe) https://en.wiktionary.org/wiki/%E0%A4%85%E0%A4%B8%E0%A4%A3%E... appears unrelated. (But might be cognate to English at home?)

Not all similarities between modern languages are inherited; coincidences do happen.

My favorite part is that the most foundational swear words in modern Slavic languages are still recognizable from their PIE roots:

https://en.wiktionary.org/wiki/Reconstruction:Proto-Indo-Eur...

https://en.wiktionary.org/wiki/Reconstruction:Proto-Indo-Eur...

In (today’s) Persian they go something like this:

Pedar, Madar, Baradar, Dokhtar

Could you explain in non-specialist language how similarities between these modern languages have anything to do with their descent from some earliest common ancestor? How is that explanation better than convergent evolution or overfitting hallucinations?

When I look at the difference between modern and “old English” they seem to have changed quite a bit [0]. When I read an etymological explanation [1], it sounds like a just so story.

0. https://www.reddit.com/r/etymology/comments/9ouweu/how_engli...

1. https://www.pimsleur.com/blog/words-for-father-around-the-wo...

English is a bit special in that it's a relatively modern mix of Old English (aka Anglo-Saxon) and what the invading Normans spoke (a Romance language), plus some more. So when you compare words it's maybe better to look at the origins of the modern English words. "Ignite", for example, is from Latin "Ignitus", via the Normans. It's fine to include English when comparing words from different IE languages, but perhaps not as the only "Western" example. Wikipedia has a much broader list which is more interesting: https://en.wikipedia.org/wiki/Indo-European_vocabulary But it's not as good as I would wish. English is included as the only modern western European language. No German, no Swedish, no Icelandic, no Dutch etc.
The explanation is better if it allows you to explain a large number of similar words arising from a common source by a systematic process.

If you have to make up a new just-so story for every pair of words, of course you're not gaining much, but if the same story works for many words at the same time, positing a common origin isn't too far-fetched.
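
To make the "same story for many words" idea concrete, here is a minimal sketch using a heavily simplified, spelling-based version of Grimm's law on word-initial consonants. It uses Latin as a stand-in for the older consonants. The word list is hand-picked for illustration, the table `GRIMM_INITIAL` and the helper `fits_rule` are made-up names, and this is in no way a real statistical test.

```python
# Simplified initial-consonant correspondences, Latin spelling -> English spelling.
# (Latin "c" is pronounced /k/.) An illustrative assumption, not a full sound law.
GRIMM_INITIAL = {"p": "f", "t": "th", "c": "h", "d": "t", "g": "k"}

# Well-known Latin/English cognate pairs.
COGNATES = [
    ("pater", "father"), ("piscis", "fish"), ("pes", "foot"),
    ("tres", "three"), ("tenuis", "thin"),
    ("cornu", "horn"), ("centum", "hundred"), ("cor", "heart"),
    ("duo", "two"), ("decem", "ten"),
    ("genu", "knee"), ("genus", "kin"),
]

def fits_rule(latin_word, english_word):
    """True if the English word starts with what the table predicts from the Latin one."""
    predicted = GRIMM_INITIAL.get(latin_word[0])
    return predicted is not None and english_word.startswith(predicted)

hits = [pair for pair in COGNATES if fits_rule(*pair)]
print(f"{len(hits)} of {len(COGNATES)} pairs follow the same five rules")

# A look-alike that is usually cited as a false cognate breaks the pattern:
# a related English word would be expected to start with "t", not "d".
print(fits_rule("dies", "day"))  # False
```

Five small rules account for a dozen pairs at once, while a superficially similar pair like "dies" / "day" fails them. That is the sense in which one systematic story beats a separate just-so story per word.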