by knuthsat 1631 days ago
It's just amazing how DL made most work completely irrelevant.

Stuff before 2010 in natural language processing is ridiculous. Dynamic programming algorithms, beam search, dependency parsing (grammar) algorithms (going from O(n^3) to O(n) with cost-sensitive algorithms), a huge focus on lexical analysis, part-of-speech, graphical models (maximum entropy, conditional random fields, etc.).

Today all of these algorithms are completely irrelevant. No one needs part-of-speech anymore, or dependency (grammar) trees, or cost-sensitive reinforcement learning reductions.

I remember being so inspired by all of that work and learning a lot from it, but it's quite funny how Lindy works.

4 comments

It felt like that 5-7 yrs ago. But now, we have 1,000s of variations of Transformers, Diffusion Models, Energy Models, Patches, Sequence models for Reinforcement Learning (complete with beam search inside!), GNNs, and others to choose from. I feel that all of the Linguistics is likely to come right back and get integrated into the various Deep Learning frameworks.

I can't help feeling that research prior to deep learning was more rigorous and impressive though. When I read papers from before, they tend to be filled with statistical modeling and proofs and are somewhat intimidating. Now it seems like it's a lot of "oh we made this model and it works".

You are partly wrong. Graphical models are orthogonal to DL.

We are starting to learn how to mix both. E.g., HMM + DL = deep Markov model. It has the advantages of both: structure and a large number of parameters.

Some SOTA NLP models follow this approach.
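
To make the "HMM + DL" idea concrete, here is a minimal sketch of one simple way to combine them: a discrete HMM whose emission distributions are produced by a tiny neural network, scored with the usual forward algorithm. This is not necessarily the deep Markov model from the literature (which typically uses continuous latent states); the names `neural_emissions`, `state_embeddings`, and `U`, and all the numbers, are made up for illustration.

```python
import math
from itertools import product

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def log_softmax(logits):
    """Normalize logits into log-probabilities."""
    z = logsumexp(logits)
    return [l - z for l in logits]

def neural_emissions(state_embeddings, U):
    """Hypothetical tiny network: for each hidden state, map its embedding
    through tanh and a linear layer to get log p(symbol | state)."""
    table = []
    for e in state_embeddings:
        h = [math.tanh(v) for v in e]
        logits = [sum(u * hv for u, hv in zip(row, h)) for row in U]
        table.append(log_softmax(logits))
    return table

def forward_loglik(obs, log_init, log_trans, log_emit):
    """Classic HMM forward algorithm in log space: returns log p(obs)."""
    n = len(log_init)
    alpha = [log_init[s] + log_emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [
            logsumexp([alpha[p] + log_trans[p][s] for p in range(n)])
            + log_emit[s][o]
            for s in range(n)
        ]
    return logsumexp(alpha)

# Toy parameters (assumed values): 2 hidden states, 3 output symbols.
log_init = [math.log(0.6), math.log(0.4)]
log_trans = [[math.log(0.7), math.log(0.3)],
             [math.log(0.2), math.log(0.8)]]
state_embeddings = [[0.5, -1.0], [-0.3, 0.8]]
U = [[1.0, 0.2], [-0.5, 0.7], [0.3, -0.9]]  # 3 symbols x 2 hidden units

log_emit = neural_emissions(state_embeddings, U)
loglik = forward_loglik([0, 2, 1], log_init, log_trans, log_emit)

# Sanity check: probabilities of all length-3 sequences should sum to 1.
total = sum(
    math.exp(forward_loglik(list(seq), log_init, log_trans, log_emit))
    for seq in product(range(3), repeat=3)
)
```

The graphical-model structure (initial, transition, and emission factors, plus exact inference by dynamic programming) is untouched; only the emission table is parameterized by a network, which is where the extra parameters would go.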

I wouldn't say they're irrelevant. POS tagging and dependency parsing are still useful for data science type applications, if not for full-scale natural language understanding. Maximum entropy and log-linear models aren't gone; they've just reappeared in a new form where the features come from neural nets.
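
A rough sketch of that last point, with made-up weights: the softmax output layer of a neural classifier has the same form as a maximum-entropy (log-linear) model, p(y | x) proportional to exp(theta_y · f(x)), except that the feature vector f(x) is computed by a network instead of being hand-engineered. The function names and numbers below are illustrative assumptions, not taken from any particular library.

```python
import math

def hidden_features(x, W, b):
    """Hypothetical one-layer tanh network standing in for a learned
    feature extractor f(x)."""
    return [
        math.tanh(sum(w * xj for w, xj in zip(row, x)) + bi)
        for row, bi in zip(W, b)
    ]

def log_linear_probs(features, theta):
    """Log-linear / maximum-entropy classifier over the given features:
    p(y | x) = exp(theta_y . f) / sum_y' exp(theta_y' . f)."""
    scores = [sum(t * f for t, f in zip(theta_y, features)) for theta_y in theta]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

# Toy parameters (assumed values): 2 inputs, 3 hidden features, 2 labels.
W = [[0.5, -0.2], [0.1, 0.8], [-0.3, 0.4]]
b = [0.0, 0.1, -0.1]
theta = [[1.0, -0.5, 0.2], [-0.4, 0.9, 0.3]]  # one weight vector per label

x = [1.0, 2.0]
features = hidden_features(x, W, b)
probs = log_linear_probs(features, theta)
```

Swapping `hidden_features` for hand-written indicator features (word suffixes, capitalization, and so on) would give back the classic maxent tagger; the classifier on top does not change.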