| HN Mirror



	by b_ttercup 3248 days ago
	Is Naive Bayes really ever the most practical choice? Yes, it is a simple, fast algorithm, but in my experience it's usually a nontrivial step below other simple models and doesn't seem to show any major advantages. The results shown here seem good, but bag-of-words models usually do better than you might think on supervised NLP. So what's the motivation?

2 comments

Houshalter 3248 days ago

The scikit-learn flowchart recommends it for text data with fewer than 100k samples when linear SVC doesn't work: http://scikit-learn.org/stable/tutorial/machine_learning_map...

AFAIK it's by far the fastest machine learning method and one of the few that can be learned "online". I.e. it can just update the model each time it gets a datapoint, and then throw the datapoint away without saving it for future training. These are nice properties if you are doing something at a very large scale or in an environment with very limited resources.
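To make the "online" part concrete, here is a minimal sketch of a multinomial naive Bayes that only keeps counts. The class name `OnlineNaiveBayes`, the Laplace smoothing parameter `alpha`, and the toy spam/ham stream are all made up for illustration; this is not the scikit-learn implementation.

```python
import math
from collections import Counter, defaultdict


class OnlineNaiveBayes:
    """Multinomial naive Bayes that learns one example at a time."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # Laplace smoothing strength (assumed default)
        self.class_counts = Counter()            # documents seen per class
        self.word_counts = defaultdict(Counter)  # word counts per class
        self.vocab = set()                       # every word seen so far

    def update(self, words, label):
        # Only the counts change; the example itself can be discarded afterwards.
        self.class_counts[label] += 1
        self.word_counts[label].update(words)
        self.vocab.update(words)

    def predict(self, words):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            # log prior + sum of smoothed log likelihoods
            score = math.log(n_docs / total_docs)
            for w in words:
                score += math.log((counts[w] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


model = OnlineNaiveBayes()
stream = [
    ("win cash prize now".split(), "spam"),
    ("meeting agenda for monday".split(), "ham"),
    ("claim your free prize".split(), "spam"),
    ("lunch on monday".split(), "ham"),
]
for words, label in stream:
    model.update(words, label)  # single pass; nothing is stored except counts

print(model.predict("free cash prize".split()))
```

Memory grows with the vocabulary, not with the number of examples seen, which is the property that matters at large scale or on constrained hardware.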

And if your data happens to actually meet the naive Bayes assumptions (that all the features are conditionally independent given the class) then it's literally mathematically optimal and you can't do any better than it. It seems to work fairly well even when that isn't the case though.
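For reference, a sketch of the decision rule behind that claim, writing $y$ for the class and $x_1, \dots, x_n$ for the features (notation chosen here, not taken from the thread):

```latex
\hat{y}
  = \arg\max_{y} P(y \mid x_1, \dots, x_n)
  = \arg\max_{y} P(y)\, P(x_1, \dots, x_n \mid y)
  = \arg\max_{y} P(y) \prod_{i=1}^{n} P(x_i \mid y)
```

The second equality is Bayes' rule with the constant $P(x_1, \dots, x_n)$ dropped; the third holds only when the features are conditionally independent given $y$. With the true probabilities, the first expression is the Bayes-optimal classifier under 0-1 loss, so under that assumption the naive Bayes rule matches it. In practice the probabilities are estimated from data, so the result is only as good as those estimates.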


phunge 3247 days ago

Keep in mind that logistic regression can easily be made online too! sklearn has an implementation of online gradient descent, and Vowpal Wabbit is also excellent at those problems.
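A rough sketch of what online logistic regression looks like, written by hand in plain Python rather than with sklearn or Vowpal Wabbit. The class name `OnlineLogisticRegression`, the learning rate `lr=0.1`, and the two-feature toy stream are assumptions made for illustration only.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


class OnlineLogisticRegression:
    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features  # weights
        self.b = 0.0                 # bias
        self.lr = lr                 # step size (assumed value)

    def predict_proba(self, x):
        return sigmoid(sum(wi * xi for wi, xi in zip(self.w, x)) + self.b)

    def update(self, x, y):
        # One SGD step on the log loss for a single example (y is 0 or 1).
        error = self.predict_proba(x) - y
        self.w = [wi - self.lr * error * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * error


model = OnlineLogisticRegression(n_features=2)
# Toy stream of 400 examples: feature 0 signals class 1, feature 1 signals class 0.
stream = [([1.0, 0.0], 1), ([0.0, 1.0], 0)] * 200
for x, y in stream:
    model.update(x, y)  # each example is used once and then dropped

print(model.predict_proba([1.0, 0.0]) > 0.5, model.predict_proba([0.0, 1.0]) < 0.5)
```

Unlike the counting update in naive Bayes, each step here depends on the current weights, so the order of the stream and the choice of `lr` affect the result.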

Naive Bayes can be parallelized in ways that SGD can't, but that's a whole other conversation.
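One way to read the parallelization point, as a sketch: naive Bayes training is just counting, and counts from separate shards can be summed in any order. The helper names `count_shard` and `merge` and the toy data are hypothetical; in a real setup each shard would be counted on a different worker or machine.

```python
from collections import Counter


def count_shard(examples):
    """Count (label, word) pairs and per-label documents for one shard."""
    word_counts, class_counts = Counter(), Counter()
    for words, label in examples:
        class_counts[label] += 1
        for w in words:
            word_counts[(label, w)] += 1
    return word_counts, class_counts


def merge(stats):
    """Sum the per-shard counters; the order of the shards does not matter."""
    word_total, class_total = Counter(), Counter()
    for word_counts, class_counts in stats:
        word_total.update(word_counts)
        class_total.update(class_counts)
    return word_total, class_total


data = [
    ("win cash prize now".split(), "spam"),
    ("meeting agenda for monday".split(), "ham"),
    ("claim your free prize".split(), "spam"),
    ("lunch on monday".split(), "ham"),
]
shards = [data[:2], data[2:]]  # pretend each shard lives on a different worker

merged = merge(count_shard(shard) for shard in shards)
# Counting the shards separately and summing gives the same sufficient
# statistics as a single pass over all the data.
assert merged == count_shard(data)
```

SGD has no equally simple merge step, because each weight update depends on the weights produced by the previous updates.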


Houshalter 3247 days ago

Gradient descent can be made online. But it's very slow and suffers from catastrophic forgetting. Typical gradient descent needs to iterate over the dataset many times, while naive Bayes only needs one pass.


tyingq 3248 days ago

I thought it was the typical approach for identifying email spam. Has that changed?
