Hacker News
by p-e-w 140 days ago
They probably have a trillion emails with human labels, either applied directly by users or inferable from actions like deleting.

With that much data, even a simple Bayesian classifier should work pretty much perfectly.
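
To make "simple Bayesian classifier" concrete, here is a rough sketch of a multinomial naive Bayes filter with add-one smoothing. The class name `NaiveBayes`, the "spam"/"ham" labels, and the four toy messages are made up for illustration; a real system would derive labels from user actions and train on vastly more data.

```python
import math
from collections import Counter


class NaiveBayes:
    """Multinomial naive Bayes over whitespace-split, lowercased words."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha          # smoothing constant (1.0 = add-one / Laplace)
        self.word_counts = {}       # label -> Counter of word occurrences
        self.doc_counts = Counter() # label -> number of training messages
        self.vocab = set()          # every word seen during training

    def train(self, text, label):
        words = text.lower().split()
        self.word_counts.setdefault(label, Counter()).update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def predict(self, text):
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # log prior: fraction of training messages with this label
            score = math.log(self.doc_counts[label] / total_docs)
            # smoothed log likelihood of each word given the label
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            for w in words:
                score += math.log((counts[w] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training set (hypothetical); labels stand in for user actions.
clf = NaiveBayes()
clf.train("win free money now", "spam")
clf.train("free prize claim now", "spam")
clf.train("meeting notes attached", "ham")
clf.train("lunch tomorrow at noon", "ham")

print(clf.predict("free money prize"))
print(clf.predict("meeting tomorrow at noon"))
```

Working in log space avoids underflow on long messages, and the smoothing keeps an unseen word from zeroing out a whole class. Nothing here depends on the amount of data; the point is only that the model gets better as the counts get larger.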