by Houshalter
3248 days ago
The scikit-learn flowchart recommends it for text data with fewer than 100k samples when linear SVC doesn't work: http://scikit-learn.org/stable/tutorial/machine_learning_map... AFAIK it's one of the fastest machine learning methods to train, since fitting is just a single pass of counting, and one of the few that can be learned "online". I.e. it can update the model each time it gets a datapoint and then throw the datapoint away without saving it for future training. These are nice properties if you are doing something at a very large scale or in an environment with very limited resources. And if your data actually meets the naive Bayes assumption (that all the features are conditionally independent given the class), then it's mathematically optimal: with the true probabilities, no classifier can do better. It seems to work fairly well even when that isn't the case, though.
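
To make the "online" part concrete, here is a rough sketch of a multinomial naive Bayes text classifier that only keeps running counts. It is plain Python rather than scikit-learn, and the class name `OnlineNaiveBayes`, the `alpha` smoothing parameter, and the toy spam/ham stream are all made up for illustration, so treat it as one possible way to do it rather than a reference implementation.

```python
import math
from collections import Counter, defaultdict


class OnlineNaiveBayes:
    """Hypothetical multinomial naive Bayes that learns one example at a time."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                        # Laplace smoothing (assumed value)
        self.class_counts = Counter()             # examples seen per class
        self.token_counts = defaultdict(Counter)  # token counts per class
        self.total_tokens = Counter()             # total tokens per class
        self.vocab = set()                        # every token seen so far

    def update(self, tokens, label):
        # Only the counts change; the example itself is not stored.
        self.class_counts[label] += 1
        for t in tokens:
            self.token_counts[label][t] += 1
            self.total_tokens[label] += 1
            self.vocab.add(t)

    def predict(self, tokens):
        n = sum(self.class_counts.values())
        v = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, c in self.class_counts.items():
            # log P(class) + sum of log P(token | class), smoothed
            score = math.log(c / n)
            denom = self.total_tokens[label] + self.alpha * v
            for t in tokens:
                score += math.log((self.token_counts[label][t] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy stream: each example is seen once and then discarded.
stream = [
    ("cheap pills buy now".split(), "spam"),
    ("meeting at noon tomorrow".split(), "ham"),
    ("buy cheap watches".split(), "spam"),
    ("lunch meeting tomorrow".split(), "ham"),
]
model = OnlineNaiveBayes()
for tokens, label in stream:
    model.update(tokens, label)

print(model.predict("cheap pills".split()))    # spam
print(model.predict("lunch meeting".split()))  # ham
```

Memory grows with the vocabulary and the number of classes, not with the number of examples, which is where the "limited resources" advantage comes from.
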
Naive Bayes can be parallelized in ways that SGD can't, but that's a whole other conversation.
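
One way to read the parallelization point, as a sketch: the sufficient statistics of a multinomial naive Bayes model are just counts, so each worker can count its own shard independently and the results can be added together in any order. An SGD step, by contrast, depends on the weights left by the previous step. The function names `count_shard` and `merge_counts` and the toy data below are hypothetical, and the shards are processed sequentially here just to keep the example short.

```python
from collections import Counter
from functools import reduce


def count_shard(examples):
    """Count classes and (class, token) pairs for one shard of data."""
    counts = Counter()
    for tokens, label in examples:
        counts[("class", label)] += 1
        for t in tokens:
            counts[("token", label, t)] += 1
    return counts


def merge_counts(a, b):
    """Merging two partial models is plain addition of their counts."""
    return a + b


data = [
    ("cheap pills buy now".split(), "spam"),
    ("meeting at noon tomorrow".split(), "ham"),
    ("buy cheap watches".split(), "spam"),
    ("lunch meeting tomorrow".split(), "ham"),
]

# Each call to count_shard could run on a separate worker or machine.
shards = [data[:2], data[2:]]
partials = [count_shard(shard) for shard in shards]
merged = reduce(merge_counts, partials, Counter())

# Same result as counting everything in one place.
print(merged == count_shard(data))  # True
```
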