by Houshalter 3248 days ago
The scikit-learn flowchart recommends naive Bayes for text data with fewer than 100k samples when linear SVC doesn't work: http://scikit-learn.org/stable/tutorial/machine_learning_map...

AFAIK it's among the fastest machine learning methods to train, and one of the few that can be learned "online", i.e. it can update the model each time it gets a datapoint and then throw that datapoint away without saving it for future training. These are nice properties if you are working at a very large scale or in an environment with very limited resources.
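To make the "online" part concrete, here is a minimal sketch of a multinomial naive Bayes for bag-of-words text, assuming Laplace smoothing. The class name `OnlineNaiveBayes` and its `update`/`predict` methods are made up for illustration and are not any library's API. (scikit-learn's `MultinomialNB` offers the same idea through `partial_fit`.)

```python
import math
from collections import Counter, defaultdict


class OnlineNaiveBayes:
    """Hypothetical multinomial naive Bayes that learns one example at a time."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                        # Laplace smoothing strength (assumed)
        self.class_counts = Counter()             # examples seen per class
        self.word_counts = defaultdict(Counter)   # word counts per class
        self.vocab = set()                        # every word seen so far

    def update(self, words, label):
        # "Training" is just incrementing counters, so the example
        # can be discarded as soon as this call returns.
        self.class_counts[label] += 1
        self.word_counts[label].update(words)
        self.vocab.update(words)

    def predict(self, words):
        # Returns None if no examples have been seen yet.
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            # log prior + sum of smoothed log likelihoods
            score = math.log(n / total)
            for w in words:
                score += math.log((counts[w] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

Usage is a loop of `nb.update(words, label)` over the incoming stream, with `nb.predict(words)` available at any point. Memory grows with the vocabulary, not with the number of examples seen.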

And if your data happens to actually meet the naive Bayes assumption (that all the features are conditionally independent given the class), then it's literally mathematically optimal: given the true probabilities, no classifier can do better. It seems to work fairly well even when that isn't the case, though.
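To spell out the optimality claim: if the features really are independent given the class, the class posterior factorizes exactly, so picking the most probable class is the Bayes-optimal decision under 0-1 loss. This assumes the true probabilities are known; with estimated ones it holds only approximately. The notation (y for the class, x_1 through x_n for the features) is mine, not the parent comment's.

```latex
P(y \mid x_1, \dots, x_n)
  = \frac{P(y) \prod_{i=1}^{n} P(x_i \mid y)}
         {\sum_{y'} P(y') \prod_{i=1}^{n} P(x_i \mid y')},
\qquad
\hat{y} = \arg\max_{y} \; P(y) \prod_{i=1}^{n} P(x_i \mid y)
```

The denominator is the same for every class, which is why the prediction only needs the numerator.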


Logistic regression can easily be made online too, keep in mind! sklearn has an implementation of online gradient descent (SGDClassifier with partial_fit), and Vowpal Wabbit is also excellent at those problems.
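For comparison, a rough sketch of what online logistic regression looks like: one stochastic gradient step on the log loss per example, after which the example can be dropped. The class `OnlineLogisticRegression` and the learning rate `lr=0.1` are illustrative assumptions, not sklearn's or Vowpal Wabbit's API.

```python
import math


class OnlineLogisticRegression:
    """Hypothetical binary logistic regression trained one example at a time."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features   # weights
        self.b = 0.0                  # bias
        self.lr = lr                  # learning rate (assumed value)

    def predict_proba(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        # Numerically stable sigmoid: never exponentiates a large positive number.
        if z >= 0:
            return 1.0 / (1.0 + math.exp(-z))
        e = math.exp(z)
        return e / (1.0 + e)

    def update(self, x, y):
        # Single SGD step on the log loss; y must be 0 or 1.
        error = self.predict_proba(x) - y
        self.w = [wi - self.lr * error * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * error
```

Unlike the counting update in naive Bayes, each step here depends on the current weights, so the result depends on the order and number of passes over the data.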

Naive Bayes can be parallelized in ways that SGD can't, but that's a whole other conversation.
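One way to read the parallelization point (my interpretation, since the comment doesn't elaborate): the naive Bayes sufficient statistics are plain counts, so each worker can count its own shard independently and the results are merged by addition. The helper names `count_shard` and `merge` below are hypothetical.

```python
from collections import Counter


def count_shard(examples):
    """Count classes and (class, word) pairs for one shard of the data."""
    class_counts, word_counts = Counter(), Counter()
    for words, label in examples:
        class_counts[label] += 1
        for w in words:
            word_counts[(label, w)] += 1
    return class_counts, word_counts


def merge(stats_a, stats_b):
    # Adding counters is associative and commutative, so shards can be
    # combined in any order and give the same totals as a single pass.
    return stats_a[0] + stats_b[0], stats_a[1] + stats_b[1]
```

SGD has no equally simple merge, because each update depends on the weights left behind by the previous one.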

Gradient descent can be made online, but it's very slow and suffers from catastrophic forgetting. Typical gradient descent needs to iterate over the dataset many times, while naive Bayes only needs one pass.