by asploder
2832 days ago
I'm glad to have kept reading to the author's conclusion:

> As a hybrid approach, you could produce a large number of inferred sentiments for words, and have a human annotator patiently look through them, making a list of exceptions whose sentiment should be set to 0. The downside of this is that it’s extra work; the upside is that you take the time to actually see what your data is doing. And that’s something that I think should happen more often in machine learning anyway.

Couldn't agree more. Annotating ML data for quality control seems essential both for making it work and for building human trust.
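For what it's worth, the hybrid approach in that quote can be sketched in a few lines. This is only a rough illustration, not the author's code: the function names `review_queue` and `apply_exceptions`, the word scores, and the exception list are all made up for the example.

```python
def review_queue(inferred, top_n):
    """Return the top_n words with the strongest inferred sentiment,
    so a human annotator can look at the most influential ones first."""
    return sorted(inferred, key=lambda w: abs(inferred[w]), reverse=True)[:top_n]


def apply_exceptions(inferred, exceptions):
    """Return a copy of the lexicon with every reviewed exception set to 0."""
    return {
        word: (0.0 if word in exceptions else score)
        for word, score in inferred.items()
    }


# Hypothetical inferred sentiments (illustrative values, not real model output).
inferred = {"great": 0.9, "terrible": -0.8, "tuesday": -0.4, "table": 0.05}

# Words the annotator decided should carry no sentiment (also hypothetical).
exceptions = {"tuesday"}

to_review = review_queue(inferred, top_n=2)
cleaned = apply_exceptions(inferred, exceptions)
```

Sorting by absolute score is just one way to order the review; the point is that the exception list stays a plain, human-readable artifact that someone actually looked at.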
|
Making this assumption is fine in some cases (for example, if you don't have training data for your domain), but if you build a classifier based on this assumption, why not just use an off-the-shelf sentiment lexicon? Do you really need to assign a sentiment to every noun known to mankind? I doubt that this improves the classification results, regardless of the bias problem.