
Several years ago I compared rspamd with SpamAssassin (SA) and Kaspersky antispam on a random stream of users' messages, and I got almost the same rates of false positives and false negatives for all three products. However, over the years spammers have become much smarter (image spam, valid DKIM, valid SPF and other clever tricks).

Regarding statistics, rspamd uses an OSBF-Bayes classifier with 5-gram input (so it is not naive Bayes). I used the following academic paper as a reference: http://osbf-lua.luaforge.net/papers/osbf-eddc.pdf. This algorithm is also used in the crm114 spam classifier. However, the Bayes classifier is a very small part of rspamd (unlike dspam, for example), and it could be almost useless if you have, let's say, 50 million user accounts. Rspamd is targeted at systems of this scale.
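
To give a rough idea of what the 5-gram input means, here is a minimal Python sketch of orthogonal sparse bigram (OSB) feature generation over a 5-token window, which is the general technique behind OSBF-style classifiers. It is not rspamd's actual code: the function name `osb_features`, the tuple layout and the whitespace-split tokens are assumptions made for illustration only.

```python
from typing import List, Tuple

# Hypothetical helper for illustration; not taken from rspamd or crm114.
def osb_features(tokens: List[str], window: int = 5) -> List[Tuple[str, int, str]]:
    """Build (left_token, skipped, right_token) triples.

    Each token is paired with every earlier token that lies inside the
    sliding window. `skipped` is the number of tokens between the two,
    so the same word pair at different distances gives distinct features.
    """
    features = []
    for i, right in enumerate(tokens):
        # Look back at up to window - 1 preceding tokens.
        for distance in range(1, window):
            j = i - distance
            if j < 0:
                break  # ran off the start of the message
            features.append((tokens[j], distance - 1, right))
    return features


if __name__ == "__main__":
    for feature in osb_features("buy cheap pills now".split()):
        print(feature)
```

The classifier then learns weights for these pair features instead of single words, which is why the result is not a plain naive Bayes over individual tokens. A real implementation would typically hash each triple into a fixed-size table rather than keep the strings.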