by authorfly 543 days ago
A multilingual version will probably be needed, as with BERT and RoBERTa. I should hasten to add that for multi-language tasks (beyond detection), either simpler methods for tasks like multi-language classification/prediction (e.g. word frequency, BERTopic-like approaches, or SVMs) or LLMs are generally better candidates.

There are a couple of reasons: 1) At that size (even for the large variant), handling multiple languages with good BLEU scores is too much to ask. 2) Encoder and decoder models tend not to be trained for translation as much as, e.g., GPT models, whose datasets include large translation texts across multiple languages (with exceptions such as the T5 translation task).
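
To make the "simpler methods" point concrete, here is a rough sketch of word-frequency language classification in plain Python. It is only an illustration: the sample sentences, the `train` and `classify` names, and the add-one smoothing are my own assumptions, not anything from a particular library, and a real setup would need far more training text per language.

```python
from collections import Counter
import math

# Tiny hypothetical training samples; a real system would use far more text.
SAMPLES = {
    "en": "the cat is on the table and the dog is in the garden",
    "es": "el gato está en la mesa y el perro está en el jardín",
    "de": "die katze ist auf dem tisch und der hund ist im garten",
}

def train(samples):
    """Build one word-frequency table per language."""
    return {lang: Counter(text.lower().split()) for lang, text in samples.items()}

def classify(text, models):
    """Return the language whose word frequencies best explain the text."""
    vocab = set().union(*models.values())  # all words seen in any language
    words = text.lower().split()
    best_lang, best_score = None, -math.inf
    for lang, counts in models.items():
        total = sum(counts.values())
        # Sum of log-probabilities with add-one smoothing, so a word unseen
        # in this language lowers the score instead of making it undefined.
        score = sum(
            math.log((counts[w] + 1) / (total + len(vocab) + 1)) for w in words
        )
        if score > best_score:
            best_lang, best_score = lang, score
    return best_lang

models = train(SAMPLES)
print(classify("the dog is on the table", models))
```

With inputs this small the scores are dominated by a handful of function words, which is roughly why this kind of approach can work for language classification and is of no use for translation.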

1 comment

Looking to do super fast embeddings, basically. A few Chinese teams seem to have produced some BERT variants, so I’ll look there.