by mkasu
2134 days ago
NLP is not my main field but is still relevant to my work, because I often use models and resources from NLP as tools. I'm also personally interested in linguistics and languages, so I follow related news, sometimes attend NLP conferences, and follow people in those fields on social media. It is very concerning how little thought is usually put into linguistic or language characteristics when dealing with these topics. I also rarely see cultural considerations. Basically everything is treated as "machine learning will hopefully get this right given enough data", which is unfortunate (ML is a great tool, but the conferences are about language processing). Another big issue I noticed is that a majority of research only targets or evaluates English texts. In many cases the language is not even specified (although it is clear from the figures or examples that they use English). I have even heard people complain that work on non-English data is treated as too minor by many reviewers, so it often just gets rejected. I think this is a really weird development for a field that centers around natural languages.
|
It's also a bit simplistic to consider it a bifurcation between "traditional" linguists and AI experts entirely ignorant of the discipline. Long before the current wave of AI started, Google liked to hire linguists and computational scientists. These teams probably do have plenty of subject matter experts, but for now they are reaping the low-hanging fruit of the suddenly improved generic methods. As the marginal improvements inevitably diminish, subject matter expertise will become more salient again.
I'm a computational biologist by training, and I have great appreciation for the often beautiful algorithms in my field, many created in the 70s or 80s, which allowed then-spectacular feats of tackling large datasets. Unfortunately, it isn't always obvious how to transfer that knowledge to the new way of doing things.