Hacker News
by onewland 6062 days ago
I think this is barfing on a lack of punctuation.

"this is the most shittest game online ever full of little nooby kids" (taken from a youtube comment) is "not likely stupid". Add a period at the end, though, and it works.

Maybe the parsing system isn't designed to handle incomplete sentences or sentence fragments? Supporting them could be difficult, but it's key: many of the bottom-of-the-barrel posts on the internet have no punctuation at all.
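
If missing terminal punctuation really is the trigger, one cheap workaround might be to normalize the input before it reaches the parser. A minimal sketch in Python, assuming the classifier takes a plain string; `ensure_terminal_punctuation` is a name I made up for illustration, not something from the tool, and I haven't confirmed this is how the parser actually fails:

```python
import re

# Hypothetical pre-processing helper; not part of the classifier under discussion.
# Matches ., ! or ? at the end of the text, optionally followed by closing
# quotes or brackets (e.g. the end of: He said "stop!").
TERMINAL = re.compile(r'[.!?]["\')\]]*$')


def ensure_terminal_punctuation(text: str, default: str = ".") -> str:
    """Append `default` when the text does not already end a sentence."""
    stripped = text.rstrip()
    if not stripped or TERMINAL.search(stripped):
        # Empty input, or already terminated: leave it alone.
        return stripped
    return stripped + default


if __name__ == "__main__":
    comment = "this is the most shittest game online ever full of little nooby kids"
    print(ensure_terminal_punctuation(comment))
```

This only patches the end of the input. A post that runs several sentences together with no punctuation at all would still reach the parser as one long fragment.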

Update: Starting to really take an interest in this. While there are false negatives, I can't seem to find any false positives.

1 comment

Unrelatedly, one of the challenges in analyzing modern text with traditional NLP tools is that the tools usually expect standard English, whereas the text is colloquial and its punctuation, when present at all, is there for timing.