by xiamx 3357 days ago
Curious to know which algorithm you chose and how you sourced the corpus used to train your system.

On a similar note, here is a curated list of great sentiment analysis methods and implementations: https://github.com/xiamx/awesome-sentiment-analysis

Thanks a lot for the link. I used a technique called Paragraph Vectors (https://cs.stanford.edu/~quocle/paragraph_vector.pdf) for sentiment feature extraction. The training collections were created semi-automatically and consisted of news titles plus short descriptions gathered from popular RSS feeds.
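For anyone wanting to try something similar, here is a rough sketch of how title + short description texts could be pulled out of an RSS feed using only the Python standard library. The sample feed, its contents, and the function name `rss_to_documents` are made up for illustration; this is not the pipeline described above, and the semi-automatic labeling step is not shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; a real feed would be downloaded from a news site.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>Markets rally after strong earnings</title>
      <description>Stocks closed higher on Tuesday.</description>
    </item>
    <item>
      <title>Storm causes widespread damage</title>
      <description>Thousands were left without power.</description>
    </item>
  </channel>
</rss>"""


def rss_to_documents(rss_xml):
    """Return one 'title description' string per <item> in an RSS 2.0 feed."""
    root = ET.fromstring(rss_xml)
    docs = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        description = (item.findtext("description") or "").strip()
        # Join whichever parts are present; skip items with no text at all.
        text = " ".join(part for part in (title, description) if part)
        if text:
            docs.append(text)
    return docs


documents = rss_to_documents(SAMPLE_RSS)
for doc in documents:
    print(doc)
```

Each resulting string could then be labeled and fed to whatever feature extractor is used (Paragraph Vectors in this case).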
Sentence vectors encode the data, but how do you determine if a story is positive or negative?
After collecting candidate positive/negative features with weights, I used a logistic regression classifier with some modifications (e.g. a position algorithm) to classify each article. It determines the article's polarity from those features (words, phrases, etc.).
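To make the idea concrete, here is a minimal sketch of a logistic regression polarity classifier over weighted word features, in plain Python. The "position algorithm" is not explained in this thread, so the `title_weight` parameter (words in the title count more than words in the description) is purely an assumption standing in for it. The toy training data and all function names are hypothetical as well.

```python
import math
from collections import defaultdict


def features(title, body, title_weight=2.0):
    """Bag-of-words features; title words get extra weight (an assumed
    stand-in for the unspecified position-based modification)."""
    feats = defaultdict(float)
    for word in title.lower().split():
        feats[word] += title_weight
    for word in body.lower().split():
        feats[word] += 1.0
    return feats


def predict_proba(weights, bias, feats):
    """Probability that the text is positive under the logistic model."""
    z = bias + sum(weights.get(f, 0.0) * v for f, v in feats.items())
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, epochs=200, lr=0.1):
    """Fit weights with plain stochastic gradient descent on log loss."""
    weights = defaultdict(float)
    bias = 0.0
    for _ in range(epochs):
        for feats, label in examples:
            error = predict_proba(weights, bias, feats) - label
            bias -= lr * error
            for f, v in feats.items():
                weights[f] -= lr * error * v
    return weights, bias


# Toy labeled data: 1 = positive, 0 = negative.
examples = [
    (features("Markets rally", "stocks closed higher"), 1),
    (features("Team wins championship", "fans celebrate great victory"), 1),
    (features("Storm causes damage", "thousands left without power"), 0),
    (features("Company reports losses", "shares fall after weak results"), 0),
]
weights, bias = train(examples)

p_pos = predict_proba(weights, bias, features("Fans celebrate victory", "great rally"))
p_neg = predict_proba(weights, bias, features("Storm damage", "shares fall"))
print("positive" if p_pos > 0.5 else "negative")
print("positive" if p_neg > 0.5 else "negative")
```

An article is called positive when the predicted probability exceeds 0.5. A real system would use far more data, phrase-level features, and regularization.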