by wrath
4130 days ago
1. You can try using bi-grams or even tri-grams to make your word list a little more precise.
2. Create a validation set by manually labeling each review as positive or negative. Each time you modify your algorithm, run it against the validation set and note the results in a spreadsheet. If you don't do that, you'll never know whether, or by how much, you've improved the results. The bigger the validation set, the better. You can also use part of your labeled set as training data for a classifier.
3. Find a scale that works to bias your score. For example, I would try biasing your negative score with a log scale: the fewer negative words you have, the more each one is worth; the more you have, the less each one is worth.
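To make the three suggestions concrete, here is a rough Python sketch. The word lists, the NEG_WEIGHT value, the function names, and the sample reviews are all made up for illustration, and weight * log(1 + count) is just one way to read "log scale". Treat it as a starting point, not a tuned scorer.

```python
import math
import re

# Hypothetical toy lexicons; a real word list would be far larger.
POSITIVE = {"good", "great", "excellent", "very good"}
NEGATIVE = {"bad", "terrible", "awful", "not good"}
LEXICON = POSITIVE | NEGATIVE

# Assumed weight for negative hits; tune it against your validation set.
NEG_WEIGHT = 3.0


def count_hits(text):
    """Count positive and negative hits, preferring bi-grams over uni-grams."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = neg = 0
    i = 0
    while i < len(tokens):
        bigram = " ".join(tokens[i:i + 2])
        if i + 1 < len(tokens) and bigram in LEXICON:
            gram, step = bigram, 2  # consume both words, so "not good" is not also "good"
        else:
            gram, step = tokens[i], 1
        pos += gram in POSITIVE
        neg += gram in NEGATIVE
        i += step
    return pos, neg


def score(text):
    """Positive count minus a log-scaled negative count.

    log1p(neg) grows slowly, so the first negative word moves the score
    the most and each additional one moves it less.
    """
    pos, neg = count_hits(text)
    return pos - NEG_WEIGHT * math.log1p(neg)


def classify(text):
    return "positive" if score(text) >= 0 else "negative"


def accuracy(labeled_reviews):
    """Fraction of hand-labeled reviews that the scorer classifies correctly."""
    correct = sum(classify(text) == label for text, label in labeled_reviews)
    return correct / len(labeled_reviews)


# Hypothetical hand-labeled validation set; yours should be much bigger.
VALIDATION = [
    ("Very good food and great staff", "positive"),
    ("Terrible, just awful", "negative"),
    ("Not good", "negative"),
    ("Good food but terrible service", "negative"),
]

if __name__ == "__main__":
    print(f"validation accuracy: {accuracy(VALIDATION):.2f}")
```

Rerun accuracy(VALIDATION) after every change to the word lists or NEG_WEIGHT and log the number, so you can tell whether a tweak actually helped.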
It would be an interesting reflection on society if there are more 1-gram ways of communicating negativity than positivity. For example, I'm more inclined to say 'terrible' for something very bad, while it feels more natural to say 'very good' than 'excellent'. If that makes any sense :)