by triyambakam 494 days ago
Can someone smarter than me explain how it's even possible to use DNA to identify the origin of a language? If this were tried with a language like German (or maybe any Western European language), the puzzle would look very confusing, and language is not DNA-based.
3 comments

The story with the Indo-Europeans is basically as follows:

1. By intersecting ancient word sets of ancient Indo-European languages using comparative phonetics we can try and reconstruct the words of the proto-IE language, both their approximate sounds and approximate meanings. This gives us some information about the society. E.g., the PIE language very likely had a word for wheel, which puts the common PIE community in the period after the wheel was invented. Other words can help us guess what landscape the PIE people lived in, and it has been generally assumed for almost a century now that it strongly resembles Southeastern Europe, essentially the Ukrainian steppe. Two alternative hypotheses (modern-day Turkey and the area to the north, in modern-day Poland/Ukraine) had different drawbacks. We can also look at the locations of the earliest historically attested IE groups (Europe, Middle East, Punjab, Anatolia) and try and guess where they all may have come from, given the time frame.

2. By looking at the descriptions of the earliest IE societies (first of all the society of Rig-Veda), we can try and guess what way of life these people had. We can then look at all the archaeological cultures in the roughly appropriate area from the roughly appropriate time frame and see which of those have features of interest (in the IE case, warrior-like culture with social stratification, etc.).

3. We know that IE speakers migrated a lot and provided a lot of genetic material to modern populations in Europe and some other regions. Since quite recently, by looking at palaeo-DNA data from the remains of the people who belonged to these cultures, we can try and check which of them made the biggest contribution to contemporary populations.

All these sources of data are rather imprecise, but if you combine them all together and see a clear pattern, this looks rather convincing.
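As a rough illustration of step 1, here is a toy sketch of the "intersection" idea: a concept is treated as a candidate for the proto-language vocabulary when related forms turn up in several independent branches. The word forms are simplified romanizations of textbook examples; the table, the branch mapping, the function names, and the three-branch threshold are all assumptions made for this sketch, and real comparative work relies on regular sound correspondences rather than a simple count.

```python
# Toy sketch only: the data layout and the threshold are assumptions,
# not how historical linguists actually score reconstructions.

# Which branch each (hypothetically chosen) language belongs to.
BRANCH = {
    "Sanskrit": "Indo-Iranian",
    "Greek": "Hellenic",
    "Latin": "Italic",
    "Old English": "Germanic",
    "Gothic": "Germanic",
}

# concept -> {language: simplified attested form}
COGNATES = {
    "wheel": {"Sanskrit": "cakra", "Greek": "kyklos", "Old English": "hweol"},
    "father": {"Sanskrit": "pitar", "Greek": "pater", "Latin": "pater", "Gothic": "fadar"},
    "horse": {"Sanskrit": "ashva", "Greek": "hippos", "Latin": "equus"},
    # Illustrative entry for a word with no counterpart listed in other branches.
    "iron": {"Latin": "ferrum"},
}


def branches_attesting(concept):
    """Return the set of branches that have a form listed for the concept."""
    return {BRANCH[lang] for lang in COGNATES.get(concept, {})}


def likely_inherited(concept, min_branches=3):
    """Crude heuristic: forms in at least `min_branches` branches."""
    return len(branches_attesting(concept)) >= min_branches


shared = sorted(c for c in COGNATES if likely_inherited(c))
print(shared)
```

A concept such as "wheel" passing this kind of filter is what lets people argue that the common community postdates the invention of the wheel.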

> the society of Rig-Veda

I fail to understand how the Rigvedic society can be connected to this DNA research. The Rigveda never mentions anything beyond the Punjab/Swat/Haryana region in any of the hymns. The flora and fauna mentioned in it are also exclusive to this region. Lastly, there is no mention of an ancient homeland in either the Rigveda or the Avesta.

I believe there's some stuff around burial practices that parallels some steppe practices. Something about horses and mound construction, I think?

Here we go: https://www.discovermagazine.com/planet-earth/chariot-racers... - make of that what you will.

While I don’t mind if they’re related, the evidence is rather thin. Interestingly, chariots and royal burials were also found in Sinauli, India, which provides an interesting alternative to this theory.

https://www.cambridge.org/core/journals/radiocarbon/article/...

Do you believe in the Out-of-India theory for IE, or are you just sceptical about the use of the Rig Veda specifically?
It gets a bit silly when you start using archaeology to prop up modern political doctrines. Humans left Africa over 100k years ago, and groups have been moving around ever since. Whether a group moved from the Caucasus to South Asia or vice versa, around 5k years ago, shouldn't really matter. Perhaps they had moved in the opposite direction 10k years ago. Obviously, we all have human ancestors who were living 5k, 10k, 100k, 200k years ago.
Option #2: I was only curious about the GP's claim, which added the Rigveda to the mix.
It's heavily contested whether these were chariots. If anything, I would suggest that the consensus scholarly opinion is that these were ox-drawn carts, not chariots.

- no horse remains or equestrian objects have been found anywhere in India for this time period

- solid wooden wheels (shown in the reconstruction) are too heavy for horses to draw, for which spoked wheels were developed in the Steppe

- the shape of the yoke that would be tied to the animals is straight, the way ox carts have, like Harappan ox carts. By contrast, yokes for horses are curved, to match the animal's posture.

I think this comment is based on some confusion about how languages spread. Languages spread along with people, but while a local language may be replaced, the people are not generally replaced with the language. There may have been some genetic mixture, and there may have been a period when they were conquered by them, but there's no sense in which the people who wrote those works _were_ Yamnayan, any more than the Germans are. They wouldn't have a story about having a faraway homeland because they wouldn't have had a faraway homeland, and nobody would have remembered any previous language because that language had been replaced thousands of years before, and well before anybody started writing anything down. They gradually picked up the language of either invaders or their trading partners, just as has happened many other times in history.

Edited to add: there are basically no migration stories in _any_ Indo-European mythological cycles or oral traditions. That's not evidence that there wasn't spread through migration or invasion, but it does indicate that it was a gradual process that wouldn't have been particularly noticeable in any one lifetime.

All the recent palaeo-DNA data suggest a horribly massive process of genetic replacement of the local population by the new arrivals. This process is of course very uneven -- e.g., the population of Ireland seems to have mostly shifted to a new IE language -- but in some cases the change was drastic. Moreover, in some parts of Europe this seems to have happened several times, with first agriculturalists replacing local hunter-gatherer populations and then IE people replacing them in turn.

The problem of IE is of course very abstract, while the problem of, e.g., Celts is much more concretely paradoxical (continental and island Celts share the language family but not a lot of archaeology and a dubious amount of genes). However, it is still a more or less commonly accepted fact that at some point in the past PIE peoples spread like wildfire, bringing their dialects, genes, and culture to a very large area, and it is of huge historical interest to know where they started from.

The fact that the IE epic and mythological traditions have zero memories of all this, I would say, is interesting but does not prove or disprove anything.

The Rig Veda is only 3000–3500 years old, contrary to folk traditions holding it to be much older. The Yamnaya culture is 5300 years old and only lasted 700 years. When the oldest parts of the Rig Veda were composed (and they are, incidentally, about the proper way to praise the gods, not about historical events) the Yamnaya culture had died out about 1100 years earlier. Those 1100 years included a lot of warfare, mostly among nomads living in tents, without writing.
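For anyone who wants to check where the 1100-year figure comes from, here is the arithmetic as a minimal sketch. It uses only the round numbers quoted above (years before present); the variable names are made up for illustration.

```python
# Round figures taken from the comment, in years before present.
yamnaya_start_bp = 5300    # approximate start of the Yamnaya culture
yamnaya_duration = 700     # how long it lasted
rigveda_oldest_bp = 3500   # oldest estimate for the Rig Veda
rigveda_youngest_bp = 3000 # youngest estimate for the Rig Veda

# End of the Yamnaya culture, in years before present.
yamnaya_end_bp = yamnaya_start_bp - yamnaya_duration

# Gap between the end of Yamnaya and the composition of the Rig Veda.
gap_min = yamnaya_end_bp - rigveda_oldest_bp
gap_max = yamnaya_end_bp - rigveda_youngest_bp

print(f"Yamnaya ended about {yamnaya_end_bp} years ago")
print(f"Gap to the Rig Veda: {gap_min} to {gap_max} years")
```

With the older Rig Veda date the gap is the 1100 years mentioned; with the younger date it grows to 1600.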

How much do English-speakers today know about the events in early 10th century France that eventually led to English becoming a sort of pidgin French, full of words like "eventually" and "sort" that didn't exist in Beowulf? How much effort do they typically devote to passing on traditions about Æthelwold's challenge to Edward the Elder in Wessex?

And that's after 1100 years of a literate, mostly settled culture with libraries that contain physical books from that time, in a culture that values that kind of factual knowledge of history, rather than more practical sorts of knowledge such as how to properly worship Agni to gain his favor and which plants to poison your arrows with.

Oral tradition can preserve knowledge to an astounding degree. There are songlines, as I understand it, that record the geography of landforms that have been undersea since the Ice Age (https://www.scientificamerican.com/article/ancient-indigenou... ; roughly the same time as the Proto-Indo-European culture). But it is hardly surprising when it is silent on a topic we wish we knew more about.

I think it’s not so much that the Rigveda by itself gives us a direct insight into Proto-Indo-European culture, but rather that if we compare it to Western texts it can help us reconstruct elements of a shared ancestral culture, or at least a shared ancestral language (from which we can perhaps infer something about culture).
The actual surviving texts are even less than 2000 years old. One just believes that the oral tradition was written down pretty much unaltered, but that's questionable in my opinion.
> events in early 10th century France

Were there such events?

> How much effort do

Not a lot, since they don’t need to because of writing. As far as we can tell, non-literate societies put massively more effort into preserving oral traditions.

Of course it’s debatable but there is some evidence that oral knowledge can be preserved for thousands of years.

> there are basically no migration stories

Irish have migration myths.

So do Greeks (probably a bit more localized intra-Balkan movement, though).

To be fair IE migrations were very long ago. It’s not inconceivable that oral myths might have been preserved for several thousand years and yet we might know nothing about them.

> wouldn't have been particularly noticeable in any one life time

Probably not true. At least, genetic evidence points the other way. IIRC we’ve found individuals as far away as Britain who were closely related (within a couple of generations) to remains found in the steppes. At least some elite groups were very closely related paternally and moved very fast across Europe.

PIE reconstructions are very interesting pieces of linguistics, but they often seem mistaken. One great analogy, which I first saw presented in some Linguisticae[1] video I think, is "what if we had no direct trace of Latin and we were looking to recreate proto-Romance roots." Of course Latin itself refers to a very wide set of linguistic practices, with all the diversity we can imagine through time, space, and individuals. Even for a given individual there are differences as they age, and depending on context they will use different sociolects and language registers. Plus, of course, not everyone is monolingual.

[1] https://www.youtube.com/@Linguisticae

That's what I wonder, whether there has been any blind backtesting of the methodology itself to see how reliable it even is. Reconstructed proto-languages tend to be overly complex and unnatural.
There have been attempts to recreate (vulgar) Latin from modern-day Romance languages, as well as using older forms of these languages to reconstruct what's known as Proto-Romance.

My recollection is that the complexity went the other way; Latin was more complex than the reconstructed languages, especially if the reconstruction didn't include Romanian, because the modern Romance languages became simpler over time in similar ways.

It's clear that the result is useful for understanding features of the ancestral language, but it's not perfect, and never will be.

On the other hand, comparative linguistics came long before genetics, and it is this field that first noticed a connection between the Indo-European languages.

Archaeological and especially genetic evidence now show the peoples of this language family (mostly) have shared (though distant and diluted) ancestry, so the field was broadly correct in noticing a connection.

It's not about the origin of a single language.

It's about the origin of a population whose widely dispersed descendants often speak a language whose primary features descend from the language spoken by the original population (albeit changed via thousands of years of drift and borrowing from other languages).

That doesn't mean that a) all features of the descendant language come from the origin language or b) all speakers of the descendant language have ancestry from the original population.

Writings on artifacts and burial practices associated with DNA fragments found at the burial sites.
This study is about prehistoric Steppe peoples; there are no Indo-European inscriptions from this time period, nor would there be any until several millennia after this time.
> there are no Indo-European inscriptions from this time period nor would there be any until several millennia after this time

That's a very negative presumption.

How about the oldest attested Indo-European language, the long-extinct Hittite, whose speakers once lived in the Bronze Age Anatolian steppe? The language is attested in cuneiform, in records dating from the 17th to the 13th centuries BCE.

The Hittite people created an empire centred on Hattusa that also extended around the northern Levant and Upper Mesopotamia [1].

[1] Hittite language:

https://en.wikipedia.org/wiki/Hittite_language

That is about 2500 years after the period we're discussing, and in a region conventionally considered to be on a different continent. It isn't a mere presumption that the Kurgan culture didn't have writing; archaeologists have been looking for it diligently for more than a century and have found extensive collections of well-preserved grave goods, but no writing. Writing was invented about 1000 years later in Sumeria, probably in Egypt, and possibly in South America, but not in the Lower Volga homeland of the Proto-Indo-Europeans. (The North American and Chinese inventions of writing seem to have been independent, but were another 2000 years later still.)

The Hittites adopted the Sumerian form of writing; they did not bring a writing system with them from the Volga. Neither did other Indo-European groups have writing, which is why Hittite is, as you say, the oldest attested Indo-European language.

The Hittite documents, besides recording several Indo-European languages from the same (Anatolian) branch of the Indo-European language family, also record some fragments of an Indic language, making that the oldest attestation of an Indo-European branch other than that of the Hittites. (The next attested Indo-European branch is Mycenaean Greek.)

That Indic language was the language of some group of people who at some point in time, perhaps after a war victory, had become the main members of the elites who ruled Mitanni, a Southern neighbor of the Hittites, located mostly in present Syria, where most inhabitants were speaking Hurrian, a non-Indo-European language.

Those Indic-speaking people were renowned as expert horse trainers, so the quotes from their language were encountered in Hittite documents about horse training.

Most known data is consistent with an older migration towards South Asia of the people speaking Indic languages, who went both towards the East, reaching India, and towards the West, reaching as far as Syria, where they came into contact with the Hittites and other related populations, who had migrated towards the South at an even earlier date and by a different path, reaching present-day Turkey.

The Indic migration was followed much later by a migration along the same path of people speaking the closely related Iranian languages, who reached the present territories of Iran, Afghanistan, and Tajikistan, forming the ancient Persian empires after various conquests.

The people whom we now name Hittites used another name for themselves. They used the name "Hittites" for a non-Indo-European population, the former inhabitants of the territory ruled by those we call Hittites.

There is some evidence of at least proto-writing existing in the “Old” European societies that the Indo-Europeans replaced, prior to 3500 BC. Of course, there is no indication that it was preserved or further developed.

https://en.m.wikipedia.org/wiki/Vin%C4%8Da_symbols

It should be noted that there is a very great difference between "proto-writing" and writing.

It is likely that various kinds of "proto-writing" have been independently invented in a lot of places, but very few of them have evolved into writing systems.

"Proto-writing" is just a set of graphic symbols that are used to designate various things. Such a set of symbols can be used e.g. to write an inventory, to tag things to show ownership or purpose, to show on a map what can be found in certain places, and so on.

"Proto-writing" cannot be used to write human speech. All systems of "proto-writing" that have evolved into writing systems have done that by reinterpreting a part of the graphic symbols, or sometimes even all of them, to no longer be the names of some things, but to have a phonetic meaning, i.e. to represent some sounds of human speech (syllables in almost all cases), allowing thus the writing of the more abstract components of the speech, like various grammatical markers.

Therefore for a system of "proto-writing", it does not make sense to ask which is the language that has been written with it, because there exists no such language.

The only kind of information that can be known about a system of proto-writing is which is the thing denoted by each symbol. Even when the meanings of all symbols are known, that does not offer any information about the language used by those who have invented and used that system of proto-writing.

For now, there is no evidence that the Indus script was a writing system, because only very short strings of symbols have been preserved. It could have been a writing system, because by that time other writing systems already existed not far away, which could have inspired them, or it could have been just a proto-writing system, which would give no clue about the language of its users.

The Proto-Indo-European language is usually dated to something like 6000 years ago, well before any writing.
So? What about the Hittites? There is a slight gap between 1700 BC and 4500 BC.
Prior to the discovery of the Hittite language, linguists had compared the various Indo-European languages they knew of and did much of the work of reconstructing the Proto-Indo-European language based on comparative linguistics. This work was highly conjectural, but it provided something akin to a falsifiable theory that could be tested by the discovery of another written Indo-European language. Such a language was Hittite, and the Hittite language fits the model of Indo-European languages that had been constructed prior to its discovery.