by bowlesbe
3507 days ago
Great point! I considered using fastText as a baseline, but in practice it didn't work well at all with the small dataset, performing much worse than the TF-IDF baseline. I think fastText's classification approach might not suit a dataset this small. I'm not sure, but I suspect it's because it tries to learn embeddings, and there just isn't anywhere near enough data for that. I'd love an outside perspective on this.
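
For a rough sense of scale, here is a back-of-the-envelope count of what a fastText-style classifier (averaged word embeddings feeding a linear output layer) has to learn. Every number below (300 examples, a 5,000-word vocabulary, 100-dimensional embeddings, 2 classes) is made up for illustration and is not from the post, and the formula is a simplification that ignores n-gram hash buckets.

```python
def embedding_classifier_params(vocab_size: int, embedding_dim: int, num_classes: int) -> int:
    """Learned parameters in a simplified bag-of-embeddings classifier:
    one embedding vector per vocabulary word plus a linear output layer."""
    return vocab_size * embedding_dim + embedding_dim * num_classes


def tfidf_linear_params(vocab_size: int, num_classes: int) -> int:
    """Learned parameters in a linear model on fixed TF-IDF features:
    one weight per (word, class) pair plus one bias per class."""
    return vocab_size * num_classes + num_classes


# Hypothetical sizes, chosen only to illustrate the order of magnitude.
num_examples = 300
vocab_size = 5_000
embedding_dim = 100
num_classes = 2

embed_params = embedding_classifier_params(vocab_size, embedding_dim, num_classes)
linear_params = tfidf_linear_params(vocab_size, num_classes)

print(f"embedding classifier: {embed_params:,} parameters "
      f"({embed_params / num_examples:,.0f} per labeled example)")
print(f"TF-IDF linear model:  {linear_params:,} parameters "
      f"({linear_params / num_examples:,.0f} per labeled example)")
```

Under these assumed sizes the embedding model has about fifty times as many learned parameters as the TF-IDF linear model, which would be consistent with my suspicion, though it doesn't prove it.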
But I'm wondering how you get around that with the neural net. In the post, you said there are only a few hundred labeled examples, right? How can a neural net with hundreds of parameters set those parameters to anything reasonable, and not overfit, when there are about as many parameters as examples?