Not a single Mesoamerican language is present: no Maya, Náhuatl, Otomí, Zapoteco, etc.
And these languages are big: they are spoken by millions and even have literature. Náhuatl and Maya are spoken in Mexico and Central America.
Are there online corpora, like Wikipedia, that could be used to train the models? Are those under a permissive enough license to be used for model training?
If they are spoken, then with enough budget a library of voices could be recorded. I think you’d prefer that collection to be gathered and maintained by a non-profit rather than Meta.