by mkasu 2130 days ago
Yes, the apparent performance of neural models in particular, compared to traditional models, is probably the main factor. That said, some voices[1] argue that traditional or much simpler approaches often still do a comparable job to heavily over-engineered models, especially when going even slightly beyond an existing target dataset or task.

I'd argue that improving the ML models is really the job of ML researchers and should mainly target ML conferences like AAAI (Adv. of AI). At other conferences (directly targeting NLP, CV, Comp. Biology, etc.), the main job should be to combine those models with domain-specific characteristics (e.g., language information for NLP) or "traditional" methods to make for an interesting discussion.

I recently reviewed for a multimedia conference, and quite a lot of the papers I reviewed were basically pure ML papers. A colleague had the same experience.

[1] https://arxiv.org/abs/1907.06902

1 comments

The ML papers wouldn't bother me if they included specialists from the targeted domain to address the obvious pitfalls. I've analyzed the figures in the blog post and skimmed the paper, and both one novelty claim ((2) A single massively multilingual model spanning 109 languages and showing cross-lingual transfer even to zeroshot cases.) and one "explanation" (Such positive language transfer across languages is only possible due to the massively multilingual nature of LaBSE) can be debunked just by looking carefully at the figures, as I did over the past hour. The languages they test on are also poorly selected (6 constructed languages, one duplicate, and one macrolanguage), which shows a clear lack of attention to detail and a poor understanding of some basic linguistic notions. But hey, it's an ML paper, it's from Google, and it has BERT in the title, so it will get attention and be cited even if it's half-crap.