Hacker News
by SomewhatLikely 3631 days ago
Specifically, the bag of n-grams can be viewed as a very sparse vector with non-zero entries corresponding to the n-grams in the bag. The dimensions of that vector are fixed by the n-grams seen during training, so an n-gram not seen during training has no entry to map to and needs to be ignored.
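
To make that concrete, here is a rough sketch in plain Python. The helper names (`ngrams`, `build_vocab`, `to_sparse`), the whitespace tokenization, and the toy sentences are all my own assumptions for illustration, not anything from a particular library. The sparse vector is stored as a dict from index to count.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the n-grams (as tuples) of a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_vocab(docs, n):
    """Give every n-gram seen in the training docs a fixed vector index."""
    vocab = {}
    for doc in docs:
        for gram in ngrams(doc.split(), n):
            vocab.setdefault(gram, len(vocab))
    return vocab


def to_sparse(doc, vocab, n):
    """Map a document to {index: count}.

    N-grams that are not in the vocabulary have no index, so they are dropped.
    """
    counts = Counter(ngrams(doc.split(), n))
    return {vocab[g]: c for g, c in counts.items() if g in vocab}


# Hypothetical toy data: two training sentences, bigrams (n=2).
train = ["the cat sat", "the cat ran"]
vocab = build_vocab(train, 2)

# "cat flew" and "flew away" were never seen in training, so only
# the entry for "the cat" ends up non-zero.
vec = to_sparse("the cat flew away", vocab, 2)
```

Only the non-zero entries are stored, which is what makes the representation practical when the vocabulary is large.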