by pokernaming 2407 days ago
> That probability is n_word/total_words in the corpus.

In this case, wouldn't it be better to just drop the denominator, since it is the same for both classes (spam and not spam)?
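
A rough sketch of why that works, assuming the denominator really is shared by both classes: dividing every score by the same positive number can't change which one is larger. The numbers and the `classify` helper below are made up for illustration and are not from the article.

```python
def classify(scores):
    """Return the label with the highest score."""
    return max(scores, key=scores.get)

# Hypothetical values for a single word, chosen only for illustration.
p_word_given_spam = 0.05
p_word_given_ham = 0.01
p_spam = 0.4
p_ham = 0.6

# Unnormalized scores: likelihood * prior, with the denominator dropped.
unnormalized = {
    "spam": p_word_given_spam * p_spam,
    "ham": p_word_given_ham * p_ham,
}

# The shared denominator P(word) is the sum of the unnormalized scores.
evidence = sum(unnormalized.values())
normalized = {label: score / evidence for label, score in unnormalized.items()}

# Same decision either way; only the normalized scores sum to 1.
assert classify(unnormalized) == classify(normalized)
```

This only holds for a denominator common to both classes. If each class divides by its own total (words seen in spam vs. words seen in non-spam), those totals generally differ and can't be dropped.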