by syllogism 4276 days ago
Thanks for the kind compliment :). And I should probably be less shouty here.

Okay, it's not true that NB is strictly dominated by MaxEnt. But look at the two example problems the author gave, where Naive Bayes was said to be a good choice. The features there definitely, definitely won't be conditionally independent given the class. And you'll probably have enough data. Naive Bayes is a bad choice here, as it is in most other situations.
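
To make the conditional independence point concrete, here's a toy sketch. The numbers are made up for illustration and have nothing to do with the tutorial's data, and `nb_posterior` is a hypothetical helper, not a library function. It takes one binary feature and feeds Naive Bayes three perfectly correlated copies of it. The copies carry no new information, but NB multiplies in a likelihood term for each one anyway, so the posterior gets overconfident:

```python
def nb_posterior(p_x_given_pos, p_x_given_neg, copies, prior_pos=0.5):
    """P(y=1 | every copy of the feature fires), computed the Naive Bayes way."""
    # Naive Bayes multiplies one likelihood term per feature, treating each
    # copy as independent evidence even when the copies are identical.
    pos = prior_pos * p_x_given_pos ** copies
    neg = (1.0 - prior_pos) * p_x_given_neg ** copies
    return pos / (pos + neg)


# Illustrative numbers (assumed): P(x=1 | y=1) = 0.8, P(x=1 | y=0) = 0.4.
true_posterior = nb_posterior(0.8, 0.4, copies=1)   # the correct answer
triple_counted = nb_posterior(0.8, 0.4, copies=3)   # same evidence, counted 3x

print(f"one copy:     {true_posterior:.3f}")   # 0.667
print(f"three copies: {triple_counted:.3f}")   # 0.889
```

A discriminatively trained model like MaxEnt isn't forced into this: it can spread one feature's worth of weight across the three copies, because it fits the weights jointly instead of estimating each feature's likelihood in isolation.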

So, I think the caveats you've raised are all true... but I still wouldn't raise them in a class I was teaching. It's easy to have a less useful discussion that's made up of individually more accurate statements. I think a common problem in technical discussion is too much attention to every qualification and every caveat.

People assume that there's some sort of proportionality between the importance of an idea or topic and the airtime you give it. And I think they do this implicitly, in a way that's really hard to consciously override. So I think it's really important to editorialise. A lot of technical discussion would be better off making statements that are untrue at the edges, accepting the imprecision, and moving on.

That's why I'm dismayed to see that someone's started writing a detailed tutorial on Naive Bayes, especially one built around two problem domains it's a bad choice for. Even if it's composed entirely of true statements, I think its net effect is to miseducate people.