by hobarrera 4048 days ago
> This information is merely interesting for some academic researches

It most definitely is not. It's the most important factor when choosing a spam filter.

False positives are extremely harmful: they can result in loss of communication, which is what you most want to avoid. A significant number of false positives is what makes the difference between a useful filter and a useless one.

Nobody wants to tell their users "check your spam mailbox (the one with dozens of spam messages) for ham every once in a while".

1 comments

As I see it, unless you can guarantee zero false positives (which, knowing how certain users compose their mail, is arguably impossible), you still have to tell users to check their spam mailbox.

Also, I suppose the false positive/negative rates can only be given for a well-defined corpus. I'm not sure there is one that represents current and future spam trends well, so in the end giving those numbers could be very misleading.
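
For what it's worth, here is a minimal sketch of how those two rates would be computed on a labelled corpus. The function name `error_rates` and the toy data are made up for illustration and do not come from any particular filter; the point is only that both numbers are defined relative to whatever corpus you feed in.

```python
def error_rates(labels, predictions):
    """Return (false_positive_rate, false_negative_rate).

    labels and predictions are equal-length sequences of "ham" or "spam".
    A false positive is ham that the filter marked as spam.
    A false negative is spam that the filter let through as ham.
    A rate is None when the corpus has no messages of the relevant class.
    """
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")

    pairs = list(zip(labels, predictions))
    ham_total = sum(1 for truth, _ in pairs if truth == "ham")
    spam_total = sum(1 for truth, _ in pairs if truth == "spam")
    false_pos = sum(1 for truth, guess in pairs
                    if truth == "ham" and guess == "spam")
    false_neg = sum(1 for truth, guess in pairs
                    if truth == "spam" and guess == "ham")

    fp_rate = false_pos / ham_total if ham_total else None
    fn_rate = false_neg / spam_total if spam_total else None
    return fp_rate, fn_rate


# Hypothetical toy corpus: 4 ham messages and 2 spam messages.
labels = ["ham", "ham", "ham", "ham", "spam", "spam"]
predictions = ["ham", "spam", "ham", "ham", "spam", "ham"]
fp_rate, fn_rate = error_rates(labels, predictions)
print(fp_rate, fn_rate)  # 0.25 0.5
```

Swap in a different corpus and the same filter will produce different rates, which is exactly why a single published figure can mislead.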