by AznHisoka
4658 days ago
For anyone interested in topic/article classification and NLP, Twitter can be a gold mine, especially its hashtags. If you gather the hashtags for a million articles, you pretty much have a co-occurrence database. Now you can mine that data and see which hashtags are common when "Google Panda" is in the title, for instance, or which hashtags are commonly used with #seo. Hashtags are basically structured semantic data if you look at them in aggregate. A good tool for this is Solr or Elasticsearch. Simply import all the hashtags for a bunch of articles into the index and do a faceted search for a specific hashtag or keyword, and you'll get the top 10 hashtags most closely associated with it.
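To make the idea concrete, here is a rough sketch rather than a definitive implementation. The sample data, the function names (`related_hashtags`, `facet_query`) and the `hashtags` field name are all made up for illustration. The first function does the co-occurrence count in memory. The second only builds a request body of the kind an Elasticsearch terms aggregation would take, assuming `hashtags` is indexed as a keyword field; it does not talk to a server.

```python
from collections import Counter

# Hypothetical sample data: article id -> hashtags from tweets linking to it.
ARTICLE_HASHTAGS = {
    "a1": ["seo", "google", "panda"],
    "a2": ["seo", "google", "marketing"],
    "a3": ["seo", "panda"],
    "a4": ["nlp", "python"],
}


def related_hashtags(article_hashtags, tag, top_n=10):
    """Return (hashtag, count) pairs for hashtags that co-occur with `tag`."""
    counts = Counter()
    for tags in article_hashtags.values():
        unique = set(tags)  # count each hashtag at most once per article
        if tag in unique:
            counts.update(unique - {tag})  # skip the queried tag itself
    return counts.most_common(top_n)


def facet_query(tag, top_n=10, field="hashtags"):
    """Build a search body asking for the top co-occurring hashtags.

    Assumption: `field` is a keyword-mapped field holding every hashtag
    of an article. The body is only constructed here, never sent.
    """
    return {
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "query": {"term": {field: tag}},
        "aggs": {
            "related": {
                "terms": {"field": field, "size": top_n, "exclude": [tag]}
            }
        },
    }


if __name__ == "__main__":
    print(related_hashtags(ARTICLE_HASHTAGS, "seo"))
    print(facet_query("seo"))
```

The in-memory version is fine for a few thousand articles. At the million-article scale mentioned above, pushing the counting into the search index is probably the more practical route.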