by jtr1 3382 days ago
The title of this piece doesn't adequately reflect the fascinating methodology it contains. The 538 team uses latent semantic analysis to create a kind of algebra for subreddits, e.g. r/running + r/weightlifting = r/fitness. Politics aside, it's (IMHO) well worth the 15 minutes it takes to read. I'd love to see HN readers more experienced with the methodology take it to task and see what shakes out.
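For anyone who wants to poke at the idea, here is a minimal sketch of what "subreddit algebra" could look like. Everything in it is an assumption for illustration: the vectors are made-up toy numbers, not 538's data, and `add`, `cosine`, and `nearest` are hypothetical helper names. The idea is just to add two subreddit vectors and look up the closest remaining subreddit by cosine similarity.

```python
import math

# Hypothetical toy vectors (made-up numbers, NOT taken from the article).
# In a real analysis each vector would be derived from commenter overlap data.
vectors = {
    "running":       [0.9, 0.1, 0.0],
    "weightlifting": [0.1, 0.9, 0.0],
    "fitness":       [0.6, 0.6, 0.1],
    "politics":      [0.0, 0.1, 0.9],
}

def add(a, b):
    """Element-wise sum of two vectors."""
    return [x + y for x, y in zip(a, b)]

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def nearest(query, exclude=()):
    """Name of the subreddit most similar to `query`, skipping `exclude`."""
    candidates = {n: v for n, v in vectors.items() if n not in exclude}
    return max(candidates, key=lambda n: cosine(query, candidates[n]))

# r/running + r/weightlifting, excluding the two inputs from the lookup
query = add(vectors["running"], vectors["weightlifting"])
result = nearest(query, exclude=("running", "weightlifting"))
print(result)
```

With these toy numbers the nearest remaining subreddit is "fitness", but that only reflects how the example vectors were chosen, not anything about the real data.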
3 comments

Agreed, I submitted the same link with the title "Semantic Analysis of Donald Trump Sub-Reddits," but even that leaves out the fascinating idea of "subreddit algebra" that the article goes into. Articles like these are why I love 538. It's like popular science for statistical analysis.
Me too, too!

Not only is this a really cool and novel (as far as I know) use of machine learning techniques, there's a lengthy footnote that goes into some detail about the method. And the presentation is great, very slick modern HTML.

So it seems like an excellent fit for HN, apart from the title which will unfortunately put a lot of people off.

This is a really clever application of this sort of model. I did a lot of work with LDA (latent Dirichlet allocation) topic modeling, but never considered doing binary operations on the topic vectors themselves. Cool idea.
You are probably right. I just used the author's title. I also found it fascinating and the scroll effects were a nice bonus :)