HN Mirror


by jph00 2959 days ago

Of course not. The use of pre-training on a large unlabeled corpus and subsequent fine-tuning is what the paper is about. It is stated repeatedly in the paper and the post.

It is totally correct and in no way misleading to say we need only 100 labeled examples. Anyone can get similar results on their own datasets without even needing to train their own wikitext model, since we've made the pre-trained model available.

(BTW, I see you work at a company that sells something that claims to "categorize SKUs to a standard taxonomy using neural networks." This seems like something you maybe could have mentioned.)


ramanan 2959 days ago

Got it. I was looking for input on how generalizable the approach is (that is, how well the weights can change/adapt) when the labeled training data is 100x smaller than the initial pre-training dataset.

Also, I don't understand the need to be so defensive, or what relevance my employer has to my post.


JPKab 2959 days ago

When you used the word "disingenuous", you invited the response you got. It was totally uncalled for to write that.

His remark about your employer was likely driven by an assumption that you viewed this as free, open-source competition to your product, hence the negative comment.

To the OP: I've done a lot of NLP, and this is phenomenal work.
