| HN Mirror



	by btw0 4845 days ago
	I've built an anti-spam system for Delicious.com using a Naive Bayes classifier with a really huge feature database (think tens of millions of features), mostly tokens from different parts of the page. The features are given different weights, which contribute to the final probability aggregation. The result was similar to what the OP achieved: around 80% accuracy. It was a really interesting and satisfying piece of work.
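
For anyone wondering what "weighted features contributing to the final probability" could look like, here is a minimal sketch, not the Delicious.com implementation. It assumes binary token features, add-one smoothing, and a per-feature weight that scales each token's log-likelihood ratio. The names `train`, `spam_probability`, and `weights` are made up for illustration, and features like `title:pills` are one hypothetical way to encode "tokens in different parts of the page".

```python
import math
from collections import Counter

def train(docs):
    """docs: iterable of (tokens, is_spam). Counts documents containing each token, per class."""
    counts = {True: Counter(), False: Counter()}
    n_docs = {True: 0, False: 0}
    for tokens, is_spam in docs:
        counts[is_spam].update(set(tokens))  # presence/absence, not term frequency
        n_docs[is_spam] += 1
    return counts, n_docs

def spam_probability(tokens, counts, n_docs, weights=None):
    """Weighted naive Bayes: sum weighted log-likelihood ratios, then squash to a probability."""
    weights = weights or {}  # assumption: missing features default to weight 1.0
    # Smoothed class prior expressed as log-odds.
    log_odds = math.log((n_docs[True] + 1) / (n_docs[False] + 1))
    for tok in set(tokens):
        # Add-one smoothing keeps unseen tokens from producing log(0).
        p_spam = (counts[True][tok] + 1) / (n_docs[True] + 2)
        p_ham = (counts[False][tok] + 1) / (n_docs[False] + 2)
        log_odds += weights.get(tok, 1.0) * math.log(p_spam / p_ham)
    # Numerically stable logistic function.
    if log_odds >= 0:
        return 1 / (1 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1 + e)

docs = [
    (["title:pills", "body:buy"], True),
    (["title:pills", "body:cheap"], True),
    (["title:python", "body:tutorial"], False),
    (["title:python", "body:news"], False),
]
counts, n_docs = train(docs)
print(spam_probability(["title:pills"], counts, n_docs))
print(spam_probability(["title:pills"], counts, n_docs, {"title:pills": 2.0}))
```

Working in log-odds keeps the sum stable even with a very large feature set, and a weight above 1.0 simply makes that feature's evidence count for more.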

1 comment

Hmm, interesting... but how do you calculate the weights? Do you use the KL-divergence method?
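
For context, one possible reading of "the KL-divergence method" is to weight each feature by how far the class distribution given that feature moves away from the class prior. The sketch below is only an illustration of that idea under stated assumptions (two classes, add-one smoothing, both classes non-empty); the thread does not say how the weights were actually computed, and `kl_weight` is a hypothetical helper name.

```python
import math

def kl_weight(spam_with, ham_with, n_spam, n_ham):
    """KL(P(class | feature) || P(class)) in nats for a two-class problem.

    spam_with / ham_with: number of spam / ham documents containing the feature.
    n_spam / n_ham: total spam / ham documents (assumed to be non-zero).
    """
    prior_spam = n_spam / (n_spam + n_ham)
    # Add-one smoothing keeps the posterior strictly between 0 and 1.
    post_spam = (spam_with + 1) / (spam_with + ham_with + 2)
    weight = 0.0
    for post, prior in ((post_spam, prior_spam), (1 - post_spam, 1 - prior_spam)):
        weight += post * math.log(post / prior)
    return weight

# A feature seen equally often in both classes carries no information.
print(kl_weight(5, 5, 100, 100))
# A feature seen only in spam gets a larger weight.
print(kl_weight(9, 0, 100, 100))
```

A feature that shifts the class distribution more gets a larger weight, so rare but highly discriminative tokens count for more than common ones.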