by LunaSea 1493 days ago
Is this still true in an era where most NLP problems use language models as a solution?
2 comments

I think so. First of all, knowing some linguistics will teach you terms and concepts (e.g. parse tree, phrase, morpheme, phoneme) that will both help you find relevant literature and avoid reinventing terms for stuff that is widely known (so others will more readily find your work).

Language models are currently the best solution for many problems, but it's hard to predict how we will move forward from here. Maybe the inclusion of linguistic information, or linguistically inspired knowledge, or whatever, will be the key to getting better results, or to saving training time/resources. With no linguistics background, I imagine it's hard to get ideas going in that direction (and to test whether it's actually a good direction).

I agree. I think having linguistics knowledge can help especially in applied situations. Linguistics knowledge can help create fallback systems when an ML system fails, or help build rules to amplify or dampen the confidence of a response from an ML system, or aid in the engineering of a system (all that comes before or after the ML blackbox).

Sort of like an algorithmic trader knowing market microstructure intimately (versus only pure statistics).
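To make the "rules that amplify or dampen confidence, plus a fallback" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the function name `adjust`, the two regex patterns, the 0.7 dampening factor and the 0.6 threshold are all hypothetical and untuned.

```python
import re

# Hypothetical patterns, chosen for illustration only.
POLITE_REFUSAL = re.compile(r"^\s*(?:i'?m|i am)\s+(?:good|fine|ok(?:ay)?)\b", re.IGNORECASE)
NEGATION = re.compile(r"\b(?:not|no|never)\b|n't", re.IGNORECASE)

def adjust(text, label, confidence, threshold=0.6):
    """Post-process an ML prediction with hand-written linguistic rules.

    Returns a (label, confidence) pair.
    """
    # Rule 1: "I'm good" as a reply is a conventional refusal, so override the model.
    if POLITE_REFUSAL.search(text):
        return "no", 1.0
    # Rule 2: negation often flips polarity, so trust the model a bit less.
    if NEGATION.search(text):
        confidence *= 0.7  # assumed dampening factor
    # Fallback: below the threshold, hand off instead of guessing.
    if confidence < threshold:
        return "unsure", confidence
    return label, confidence

print(adjust("I'm good, thanks", "yes", 0.9))
print(adjust("This is not bad", "negative", 0.8))
```

The rules are deliberately crude (Rule 1 would also fire on "I'm good at chess"), which is sort of the point: knowing some pragmatics tells you where rules like these will break.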

Language models as a solution to what problems?

Yes, you can easily use AutoModel.from_pretrained('bert-base-uncased') to convert some text into a vector of floats. What then?

What are the properties of downstream (aka actually useful) datasets that might make few-shot transfer difficult or easy? How much data do your users need to provide to get a useful classifier/tagger/etc. for their problem domain?

Why do seemingly minor perturbations like typos or concatenating a few numbers result in major differences in representations, and how do you detect/test/mitigate this to ensure model behavior doesn't result in weird downstream system behavior?
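One way to at least detect this is a perturbation test: encode each input before and after a small edit and flag the cases where the representation moves too far. A rough sketch follows. `toy_encode` is a stand-in (a bag of character trigrams) so the example runs without a model; in practice you would pass your real embedding function as `encode`. The helper names and the 0.8 similarity threshold are assumptions.

```python
import math
from collections import Counter

def toy_encode(text, n=3):
    """Stand-in encoder: bag of character n-grams. Replace with a real model."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def swap_typo(text, i=0):
    """Swap the characters at positions i and i+1, a common typing error."""
    chars = list(text)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

def unstable_cases(texts, perturb, encode=toy_encode, min_sim=0.8):
    """Return (original, perturbed, similarity) for inputs that moved too far."""
    flagged = []
    for original in texts:
        perturbed = perturb(original)
        sim = cosine(encode(original), encode(perturbed))
        if sim < min_sim:
            flagged.append((original, perturbed, sim))
    return flagged

for case in unstable_cases(["this product solves all my problems", "ok"], swap_typo):
    print(case)
```

The same harness works for other perturbations (appending digits, changing case, dropping diacritics) by passing a different `perturb` function.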

How do you train a dialog system to map 'I'm good, thanks' to 'no'? How do you train a sentiment classifier to learn from contextual/pragmatic cues rather than purely lexical ones (example: 'I hate to say it but this product solves all my problems.' - positive or negative sentiment?)

How bad is the user experience of your Arabic-speaking customers compared to that of your English-speaking customers, and what can you do to measure this and fix it?
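A starting point for the "measure this" part is simply slicing your evaluation metric by language and looking at the gap to a reference language. A minimal sketch, assuming you have labeled examples tagged with a language code; the function names and the tiny example data are made up for illustration.

```python
from collections import defaultdict

def accuracy_by_language(examples):
    """examples: iterable of (language, gold_label, predicted_label) tuples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for lang, gold, pred in examples:
        total[lang] += 1
        correct[lang] += int(gold == pred)
    return {lang: correct[lang] / total[lang] for lang in total}

def gap_to_reference(scores, reference="en"):
    """Accuracy difference between the reference language and every other one."""
    return {lang: scores[reference] - s for lang, s in scores.items() if lang != reference}

# Made-up predictions, for illustration only.
examples = [
    ("en", "pos", "pos"), ("en", "neg", "neg"),
    ("ar", "pos", "pos"), ("ar", "neg", "pos"),
]
scores = accuracy_by_language(examples)
print(scores, gap_to_reference(scores))
```

Accuracy is only one slice of user experience, and the per-language test sets need to be comparable for the gap to mean anything.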

My linguistics background really helps me think through a lot of these 'applied' NLP problems. Knowing how to make matmuls fast on GPUs and knowing exactly how multihead self-attention works is definitely useful too, but that's only one piece of building systems with NLP components.

> My linguistics background really helps me think through a lot of these 'applied' NLP problems.

There are many benchmarks where LMs absolutely outperform mechanical linguistics solutions.

Do you have success stories in the opposite direction, where a linguistics-based solution significantly outperforms an LM?

There's no competition between linguistics and ML/NLP; they have completely different goals as fields.

I meant that my linguistics background helps me understand & solve problems: studying linguistic field work has helped me design crowd labeling jobs, knowing about morphology helps me understand why BPE tokenizers work so well (and when they might not), knowing about syntax/dominant word order makes me think that multilingual BERT should probably do something more intelligent with positional embeddings, methods from psycholinguistics are useful for understanding entropy/surprisal wrt LM next-word probabilities... just a few examples but the list could go on.
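For anyone unfamiliar with the surprisal/entropy bit: surprisal of a word is the negative log of its probability, and entropy is the expected surprisal over the next-word distribution. A small sketch, with a made-up next-word distribution standing in for real LM output:

```python
import math

def surprisal(p):
    """Surprisal in bits of an outcome with probability p."""
    return -math.log2(p)

def entropy(dist):
    """Expected surprisal (in bits) of a distribution given as {word: probability}."""
    return sum(p * surprisal(p) for p in dist.values() if p > 0)

# Hypothetical next-word probabilities, not from any real model.
next_word = {"the": 0.5, "a": 0.25, "cat": 0.25}

print(surprisal(next_word["cat"]))  # 2.0 bits
print(entropy(next_word))           # 1.5 bits
```

A low-probability continuation has high surprisal, which is the quantity psycholinguists relate to things like reading times.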