That would be one way to address the lack of available data sets for languages other than English.