by liliumregale
1494 days ago
This is a God (language)-of-the-gaps argument: we can't figure out this rarely attested language, but maybe we can figure out an entirely unattested language instead, and also learn the correspondence between it and Linear A. Deep learning can predict plenty of phenomena in the world, sure, but it needs data, not aspirations.
I did not say that. Human languages evolve in similar ways and use similar vocabulary, grammar, etc. Linguistics has already unraveled the structure of many languages and how languages evolve over time.
I am not saying DL is THE approach to take, but given that there are only ~10k characters of Linear A, it is hard to tackle the problem without a common representation of multiple languages that are close to it. That's the whole point of DL: building better and better representations, not accurately modeling uncertainty (which is what you get by doing statistics).
I would say XLM [0] builds a common representation of a collection of languages and then works better on machine translation for languages where data is scarce but which are related to the languages in the model. (It also discovers and represents structure such as part of speech, grammar, and entities without being told about those particular things.)
Is there an abundance of data for languages close to Linear A? If not, then I admire the work of all who try to untangle this with their brains alone.
0: https://github.com/facebookresearch/XLM