by serverhorror 4764 days ago
I think the spammers are trying to pollute the publicly available text corpus with Markov-chain output (or something similar). My guess is that the goal is to push their pages closer to the top of search results. How? Well, the text is on GitHub (a credible site) and has no links (so it doesn't look like spam), which means text like this probably won't be treated as spam when it shows up elsewhere.

Of course, "elsewhere" the text does contain links. But now it ranks better and is more likely to be found. So in the end it's a win for the spam blogs, because they get more hits/traffic.