
I imagine training only on genuine visitors would be tricky with any traditional classification approach. Even a 90/10 split of positive/negative training data is difficult, since a lot of classifiers will just degenerate into always predicting the majority class.
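To make the majority-class problem concrete, here is a rough sketch with made-up labels (90 genuine visitors as 1, 10 non-genuine as 0). The helper names are hypothetical and the "classifier" is just a baseline that ignores features entirely, but it shows why accuracy alone is misleading on a 90/10 split:

```python
from collections import Counter

def majority_baseline(labels):
    """Return the most common label; a degenerate 'classifier' that ignores features."""
    return Counter(labels).most_common(1)[0][0]

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred, positive):
    """Fraction of examples with true label `positive` that were predicted as such."""
    relevant = [p for t, p in zip(y_true, y_pred) if t == positive]
    return sum(p == positive for p in relevant) / len(relevant)

# Hypothetical data: 90 genuine visitors (1), 10 non-genuine (0).
y_true = [1] * 90 + [0] * 10
guess = majority_baseline(y_true)
y_pred = [guess] * len(y_true)

print(accuracy(y_true, y_pred))   # 0.9
print(recall(y_true, y_pred, 0))  # 0.0
```

So the baseline looks 90% accurate while never catching a single minority-class example, which is the failure mode to watch for.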

Maybe a Restricted Boltzmann Machine or something similar?
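As a much simpler stand-in for the idea (this is not an RBM, just per-feature Gaussian z-scores), here is a sketch of fitting a model on genuine visitors only and scoring how unusual a new visit looks. The features, the sample values, and the threshold of 3.0 are all assumptions for illustration:

```python
from statistics import mean, stdev

def fit_one_class(rows):
    """Estimate per-feature mean and sample standard deviation from genuine-only rows."""
    columns = list(zip(*rows))
    return [(mean(c), stdev(c)) for c in columns]

def anomaly_score(model, row):
    """Largest absolute z-score across features; higher means less like the training data."""
    return max(abs(x - m) / s for x, (m, s) in zip(row, model))

# Hypothetical features per visit: (pages viewed, seconds on site).
genuine = [(3, 40), (5, 65), (4, 50), (6, 80), (2, 30), (5, 70)]
model = fit_one_class(genuine)

typical = (4, 55)  # resembles the training visits
odd = (60, 2)      # many pages in almost no time

print(anomaly_score(model, typical) < 3.0)  # True
print(anomaly_score(model, odd) > 3.0)      # True
```

An RBM or autoencoder would play the same role, with reconstruction error or free energy as the score instead of a z-score, and no negative examples are needed at training time.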