by elyase 2722 days ago

The closest alternatives in this space would be AllenNLP [1], the recently released PyText [2] and spaCy [3]. PyText's authors wrote up a comparison in the accompanying paper [4] and in this GitHub issue [5].

[1] https://github.com/allenai/allennlp

[2] https://github.com/facebookresearch/pytext

[3] https://spacy.io

[4] https://arxiv.org/pdf/1812.08729.pdf

[5] https://github.com/facebookresearch/pytext/issues/110

3 comments

mkl 2722 days ago

Do you know if any of these can be used for text prediction? (I.e. guessing what the next word/token will be.)

yorwba 2722 days ago

Text prediction is usually called "language modeling" in NLP. Because it's useful as a weak supervision signal to improve performance on other tasks, most of the mentioned libraries support it. However, they might not always provide complete examples, instead assuming that you know how to express the model and train it using the primitives provided by the library.

Flair: https://github.com/zalandoresearch/flair/blob/master/flair/m...

Allen NLP: https://github.com/allenai/allennlp/blob/master/allennlp/dat...

PyText: https://github.com/facebookresearch/pytext/blob/master/pytex...

spaCy seems to focus on language analysis and I couldn't find an API that'd be directly usable for text generation.

plagtag 2721 days ago

Flair looks really promising to me!

be_erik 2722 days ago

Markov chains can be used to do type-ahead prediction. It's likely what iOS uses for its predictive keyboard.

https://en.wikipedia.org/wiki/Markov_chain
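As a rough illustration of the idea, here is a minimal first-order (bigram) Markov chain for next-word suggestions. It is only a sketch: the function names `train_bigram_model` and `predict_next` and the toy corpus are made up for this example, and nothing here reflects how any real keyboard is implemented.

```python
from collections import Counter, defaultdict


def train_bigram_model(tokens):
    """Count, for every token, which tokens follow it in the corpus."""
    followers = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        followers[current][following] += 1
    return followers


def predict_next(model, token, k=3):
    """Return up to k of the most frequent followers of `token`."""
    if token not in model:
        return []  # unseen token: no suggestion
    return [word for word, _ in model[token].most_common(k)]


# Toy corpus, already tokenised by whitespace (an assumption for the example).
corpus = "the cat sat on the mat and the cat ate the fish".split()
model = train_bigram_model(corpus)
suggestions = predict_next(model, "the")
print(suggestions)  # "cat" comes first because it follows "the" most often
```

A real type-ahead system would need a much larger corpus, smoothing for unseen words, and usually a longer context than one token.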

mkl 2722 days ago

Yes, there are plenty of methods, and I have a couple implemented, but an off-the-shelf one from a cutting-edge library would likely be better.

edraferi 2721 days ago

It's gonna be hard to get an "off the shelf" model for text prediction, because the upcoming text depends on the author, topic, and other context. You can probably find some decent pre-trained models to get started, but you'll need to customize them for your application to get good results.

mkl 2721 days ago

Right, I was thinking off-the-shelf in the sense of giving it a tokenised corpus and it does the rest, or it incorporates that into its existing model. Dictation software, phone keyboards, etc. do this.

starchand 2722 days ago

Which method would work best for classifying emails into 1 of 7 categories? The problem I've seen is that 1 or 2 key sentences within the email can classify the message, but they are usually outnumbered by generic sentences such as signatures, greetings, headers/footers, etc.

sachin18590 2721 days ago

These are all frameworks, and none of them has a singular advantage over the others, especially for the problem you describe. You should be able to figure out what works best for you based on the classification sensitivity you need and the training data you have; depending on those two factors, the problem can range from quite simple to extremely complex. spaCy's pre-processing tools are easy to use, and combined with a tool like talon they should help you clean up the email correctly. After that, if your email text is pretty much to the point, any intent classification tool will work. However, if the email text is long and the intents are spread across it, you will need a hierarchical layer to understand the intent hierarchies, as well as an attention layer to decide which intents to focus on and not lose track of within an email. At that point you are quite far from a generic plug-and-play framework, and you will need a thorough understanding of the deep learning models you are working with, the dataset you have, and the classification you are trying to build.
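To make the attention idea a little more concrete, here is a minimal sketch of attention pooling over sentence vectors in plain Python. The sentence vectors and the query vector are hand-picked numbers for illustration; in a real model both would be learned, and the helper names `softmax` and `attention_pool` are assumptions of this example, not part of any library mentioned above.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(sentence_vectors, query):
    """Weight each sentence by its dot product with `query`, then average."""
    scores = [sum(v * q for v, q in zip(vec, query)) for vec in sentence_vectors]
    weights = softmax(scores)
    pooled = [
        sum(w * vec[d] for w, vec in zip(weights, sentence_vectors))
        for d in range(len(query))
    ]
    return pooled, weights


# Three toy sentence vectors: greeting, key sentence, signature.
sentence_vectors = [[0.1, 0.0], [3.0, 0.2], [0.0, 0.1]]
query = [1.0, 0.0]  # hypothetical "what matters for the intent" direction
pooled, weights = attention_pool(sentence_vectors, query)
print(weights)  # the second sentence receives most of the weight
```

The pooled vector is what a downstream classifier would see, so sentences with low attention weight (greetings, signatures) contribute little to the final decision.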

starchand 2721 days ago

Thanks, this is really helpful! I am using talon and sklearn as a paragraph-by-paragraph intent classifier, and I classify the whole email by the highest individual intent probability. This seems to be working well on my minimal test data (~200 sentences), but I have yet to test it in the wild. I will research hierarchical and attention layers.
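The aggregation step described here could look roughly like the following sketch. The classifier is replaced by a toy stand-in, `toy_predict_proba`, with made-up labels and probabilities; a real setup would call the trained model instead, and `classify_email` is a hypothetical helper name.

```python
def classify_email(paragraphs, predict_proba):
    """Label an email with its single most confident paragraph-level prediction."""
    best_label, best_prob = None, 0.0
    for paragraph in paragraphs:
        for label, prob in predict_proba(paragraph).items():
            if prob > best_prob:
                best_label, best_prob = label, prob
    return best_label, best_prob


def toy_predict_proba(paragraph):
    """Stand-in for a trained classifier: returns {label: probability}."""
    if "refund" in paragraph.lower():
        return {"billing": 0.9, "support": 0.05, "other": 0.05}
    # Generic text gets a flat, low-confidence distribution.
    return {"billing": 0.3, "support": 0.3, "other": 0.4}


email = [
    "Hi team,",
    "I was charged twice and would like a refund.",
    "Best regards, Sam",
]
label, prob = classify_email(email, toy_predict_proba)
print(label, prob)
```

Note that this only works if generic paragraphs are scored with low confidence; a signature that the classifier confidently assigns to some category would win the vote.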

wodenokoto 2722 days ago

Generally these generic sentences should be randomly distributed across the categories, and so their effect should be minimal.

You can randomly add them to your training set if you feel that real-world data has them randomly distributed but your training sample is too small to capture this.
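A minimal sketch of that kind of augmentation, assuming the training data is a list of (text, label) pairs and that a small list of boilerplate sentences is available. The function name `augment_with_boilerplate`, its parameters, and the sample data are all invented for this example.

```python
import random


def augment_with_boilerplate(examples, boilerplate, max_added=2, seed=0):
    """Return copies of (text, label) pairs with boilerplate randomly inserted."""
    rng = random.Random(seed)  # fixed seed so the augmentation is repeatable
    augmented = []
    for text, label in examples:
        sentences = [text]
        for _ in range(rng.randint(0, max_added)):
            position = rng.randint(0, len(sentences))  # before or after existing text
            sentences.insert(position, rng.choice(boilerplate))
        augmented.append((" ".join(sentences), label))  # label is unchanged
    return augmented


examples = [
    ("I was charged twice this month.", "billing"),
    ("The app crashes on startup.", "support"),
]
boilerplate = ["Hello,", "Best regards, Sam", "Sent from my phone"]
augmented = augment_with_boilerplate(examples, boilerplate)
print(augmented)
```

Because the inserted sentences are drawn independently of the label, the classifier gets no signal from them, which is the point of the exercise.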

mongodude 2722 days ago

I like Flair for three reasons:

- Easy to use

- Developers are very active

- State-of-the-art results using approaches that are easy to understand and work well for most text classification tasks
