by onecreativenerd 5488 days ago
Here's a first pass at auto-discovering topics in that text using LDA (Latent Dirichlet Allocation) in R: http://opani.com/ryan/sarah-palin-email-topics/148845806827/

The biggest deficiency right now is that the stopword list isn't fully working, so mundane topics are creeping in. I'll have limited time to play with it over the next few days, so feel free to make some cool discoveries. :)
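
For anyone who wants to poke at the stopword problem, here is a rough sketch of one common fix: filter tokens against a stopword list before they reach the topic model, then grow that list with terms that show up in a large share of the documents. This is in Python rather than the R used above, and it is not the code behind the linked analysis. The names `STOPWORDS`, `tokenize`, `extend_stopwords`, and the 0.5 cutoff are all made up for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "and", "or", "to", "of", "in", "is", "it", "for", "on", "re", "fw"}


def tokenize(text, stopwords=STOPWORDS, min_len=3):
    """Lowercase, split on non-letters, drop stopwords and very short tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in stopwords and len(t) >= min_len]


def extend_stopwords(docs, stopwords=STOPWORDS, max_doc_fraction=0.5):
    """Return the stopword set plus any term found in more than
    max_doc_fraction of the documents (an assumed, tunable cutoff)."""
    doc_freq = Counter()
    for doc in docs:
        # Count each term once per document, not once per occurrence.
        doc_freq.update(set(tokenize(doc, stopwords)))
    cutoff = max_doc_fraction * len(docs)
    return stopwords | {term for term, n in doc_freq.items() if n > cutoff}
```

The idea is that words like "email" or "forward" are not in any generic stopword list but are close to universal in an email corpus, so they end up dominating topics unless they are filtered by document frequency. The cutoff is a judgment call and would need tuning against the actual data.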