Hacker News new | ask | show | jobs
by s3nh_ 2481 days ago
the hardest part in training model in foreign languages is to get correctly labeled dataset. I worked with pretrain model on Polish language documents and based on this experience it is relatively good if you are using some text similarity measures. There are some examples/pretrain models with Korean/English/French language