by Eridrus 3512 days ago
> FastText(.zip) continues to be a weird project.

What is weird to you about the project? I haven't looked at the details, but the motivation pretty obviously seems to be running deep learning models on people's phones without seriously impacting UX.

Hell, even running a large vocabulary model on a server can be annoying when these models take ~10GB just to store the word vectors.

Well, it isn't deep learning for one thing.

Basically it's a reappraisal of early-2000s-style manually engineered features. It's good work, but doesn't add much over Vowpal Wabbit.

I haven't read the .zip paper in depth, but the mobile angle doesn't seem convincing to me. Text models generally just aren't that big! Drop the number of dimensions in W2V and it's really pretty small, and still expressive.
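To put rough numbers on that, here is a back-of-the-envelope sketch of how a dense word-vector table scales with vocabulary size and dimensionality. The vocabulary sizes, dimensions, and the `embedding_bytes` helper are made-up assumptions for illustration, not figures from the FastText.zip paper, and it assumes plain float32 storage with no compression.

```python
def embedding_bytes(vocab_size: int, dims: int, bytes_per_value: int = 4) -> int:
    """Bytes needed for a dense vocab_size x dims table (float32 by default)."""
    return vocab_size * dims * bytes_per_value


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return n_bytes / 2**30


# Hypothetical configurations: (vocabulary size, vector dimensions).
configs = [
    (2_000_000, 300),  # large vocabulary, typical dimensionality
    (2_000_000, 50),   # same vocabulary, fewer dimensions
    (100_000, 50),     # pruned vocabulary, fewer dimensions
]

for vocab_size, dims in configs:
    size = embedding_bytes(vocab_size, dims)
    print(f"{vocab_size:>9,} words x {dims:>3} dims: {to_gib(size):.3f} GiB")
```

The size is linear in both factors, so cutting dimensions from 300 to 50 shrinks the table sixfold before any quantization or pruning is applied.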

Don't get me wrong - I like FastText. But it surprises me that it remains a research direction - almost everyone else is trying other approaches to get an AlexNet-like breakthrough on NLP tasks. It's pretty clear that breakthrough won't come from the FastText approach.

> Well, it isn't deep learning for one thing.

You know, I didn't actually realise that. I had only glanced at it and assumed they were applying these ideas to deep models.

> Drop the number of dimensions in W2V and it's really pretty small, and still expressive.

I don't think it's crazy to want to get better performance with small memory targets though.

They're working on other directions too, but maybe this is useful for their product groups.

What do you think are the more promising approaches? If you could link to some papers, I'd love to read some of them.