Hacker News
by markovbling 4134 days ago
I definitely think I should look at using bi-grams and tri-grams.

It would be an interesting reflection on society if there are more 1-gram ways of communicating negativity than positivity. For example, I'm more inclined to say 'terrible' for something very bad, while 'very good' feels more natural than 'excellent'. If that makes any sense :)
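
For anyone following along, here is a rough sketch of what pulling 1-grams, bi-grams and tri-grams out of a sentence could look like. The function names (`ngrams`, `ngram_features`) and the whitespace tokenization are just assumptions for illustration, not taken from any particular project:

```python
def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_features(text, max_n=3):
    """Collect all 1-grams up to max_n-grams from a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    feats = []
    for n in range(1, max_n + 1):
        feats.extend(ngrams(tokens, n))
    return feats


if __name__ == "__main__":
    # 'very good' only shows up as a single feature once bi-grams are included.
    for feat in ngram_features("The film was very good"):
        print(feat)
```

With only 1-grams, 'very' and 'good' are separate features; the bi-gram ('very', 'good') is what lets a model treat the pair as one unit.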

1 comment

I found this paper useful for a side project I worked on a few months ago, which used n-grams in a naive Bayes classifier:

http://arxiv.org/pdf/1305.6143v2.pdf
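
To give a feel for what 'n-grams in a naive Bayes classifier' means in practice, here is a minimal sketch. It is not the paper's model or my project's code: the class name `NaiveBayes`, the `features` helper, the add-one smoothing and the four toy training sentences are all assumptions made up for the example.

```python
import math
from collections import Counter, defaultdict


def features(text, max_n=2):
    """1-grams and bi-grams of a lowercased, whitespace-split text, as strings."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)]


class NaiveBayes:
    """Multinomial naive Bayes over n-gram counts with add-one smoothing."""

    def __init__(self):
        self.class_docs = Counter()              # documents seen per label
        self.feat_counts = defaultdict(Counter)  # label -> feature -> count
        self.vocab = set()                       # every feature seen in training

    def train(self, text, label):
        self.class_docs[label] += 1
        for f in features(text):
            self.feat_counts[label][f] += 1
            self.vocab.add(f)

    def predict(self, text):
        total_docs = sum(self.class_docs.values())
        best_label, best_score = None, float("-inf")
        for label, n_docs in self.class_docs.items():
            # log prior plus the sum of smoothed log likelihoods
            score = math.log(n_docs / total_docs)
            total = sum(self.feat_counts[label].values())
            for f in features(text):
                if f not in self.vocab:
                    continue  # skip features never seen in training
                count = self.feat_counts[label][f]
                score += math.log((count + 1) / (total + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


if __name__ == "__main__":
    nb = NaiveBayes()
    nb.train("very good film", "pos")
    nb.train("really great movie", "pos")
    nb.train("terrible film", "neg")
    nb.train("not good at all", "neg")
    print(nb.predict("not good"))
    print(nb.predict("very good"))
```

In this toy data 'good' appears once in each class, so it is the bi-grams 'not good' and 'very good' (plus the neighbouring 1-grams) that separate the two test phrases.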

and the lead author's GitHub repos are:

https://github.com/vivekn/sentiment https://github.com/vivekn/sentiment-web

He's implemented 'negative bi-gram detection' (my phrasing, not his) with this function:

https://github.com/vivekn/sentiment/blob/master/info.py#L26-...

...which I found useful as a jumping-off point. Good luck!
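
In case the link goes stale, here is a rough sketch of the general idea as I understand it. To be clear, this is not the function from the repo: `mark_negation`, the `NEGATORS` set and the punctuation rule are my own assumptions. Words that follow a negation word get a `not_` prefix until the next punctuation mark, so 'good' and 'not_good' become different features.

```python
NEGATORS = {"not", "no", "never"}  # assumed list, extend as needed
DELIMS = ".,!?;:"                  # punctuation that ends a negated span


def mark_negation(text):
    """Prefix tokens following a negation word with 'not_' until punctuation."""
    negated = False
    out = []
    for raw in text.lower().split():
        word = raw.strip(DELIMS)
        if word:
            out.append("not_" + word if negated else word)
        # Start a negated span after words like 'not' or contractions like "didn't".
        if word in NEGATORS or word.endswith("n't"):
            negated = True
        # Trailing punctuation closes the span.
        if raw[-1] in DELIMS:
            negated = False
    return out


if __name__ == "__main__":
    print(mark_negation("This is not very good, but I liked it."))
```

The marked tokens can then be fed to whatever classifier you're using in place of the raw words.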