Show HN: Naive Bayes classifier for text categorization in five steps


	Show HN: Naive Bayes classifier for text categorization in five steps (towardsdatascience.com)
	7 points by gchavez2 2665 days ago

3 comments

jgrahamc 2665 days ago

This is not a bad explanation, but in practice it is useful to take log() of the probabilities so that you sum logs rather than multiply many small floats, which can underflow to zero.

http://getpopfile.org/docs/faq:bayesandlogs
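
A minimal sketch of the point above, using made-up per-word probabilities. The function names `product_score` and `log_score` and all the numbers are assumptions for illustration, not taken from the article or from POPFile:

```python
import math

def product_score(probs):
    """Multiply the probabilities directly; many small factors can underflow."""
    score = 1.0
    for p in probs:
        score *= p
    return score

def log_score(probs):
    """Sum log-probabilities instead; finite for any strictly positive inputs."""
    return sum(math.log(p) for p in probs)

# Hypothetical per-word likelihoods for a 400-word message under two classes.
spam_probs = [1e-3] * 400
ham_probs = [1e-4] * 400

# The direct products are too small for a double, so the two classes
# can no longer be told apart.
print(product_score(spam_probs), product_score(ham_probs))

# The log scores stay finite and can still be compared.
print(log_score(spam_probs) > log_score(ham_probs))
```

Because log is monotonic, picking the class with the largest sum of logs should give the same answer as picking the largest product, whenever the product is representable at all.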


gchavez2 2664 days ago

Thank you for the insight, John. I have included your remark in the article.


ColinWright 2665 days ago

From the article:

    For an English spam classifier that
    considers all the words in the English
    language, the number of the words (n)
    is approximately 171,476.

That's a remarkably precise number to be preceded by the word "approximately".


gchavez2 2664 days ago

Agreed, that was odd. It now reads:

"the number of the words (n) is approximately 170k"

Thank you for the remark.


atum47 2662 days ago

Nice article, very glad to read it. Keep up the good work.


gchavez2 2659 days ago

Thank you, Victor. I enjoyed your JS articles too!
