Hacker News
by bravura 1775 days ago
Is anyone aware of classification (e.g. word prediction) datasets for low-resource and endangered languages?

If so, we would like to use them for the HEAR NeurIPS competition: https://github.com/microsoft/DNS-Challenge/tree/master/datas...

The challenge is restricted to classification tasks; sequence modeling such as full ASR is unfortunately beyond the scope of the competition.