by adpreese 4643 days ago
Sure, there's value in rare words, but I don't think anything that occurs fewer than 3 times across the corpus is going to tell you anything useful. You need a certain number of occurrences before a word is a real signal rather than noise. What was the least frequent useful word in the data set, msalahi?
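
To make the cutoff concrete, here's a rough sketch of the kind of filter I mean. It assumes whitespace tokenization and a toy corpus I made up; the function name `frequent_vocab` and the `min_count` parameter are just for illustration, not anything from the original analysis.

```python
from collections import Counter

def frequent_vocab(docs, min_count=3):
    """Return {word: count} for words seen at least min_count times across all docs."""
    # Naive tokenization: lowercase and split on whitespace (an assumption).
    counts = Counter(word for doc in docs for word in doc.lower().split())
    return {word: n for word, n in counts.items() if n >= min_count}

# Toy corpus, invented for this example.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "a cat and a dog",
]

vocab = frequent_vocab(docs)  # keeps only words with 3+ occurrences
```

Whether 3 is the right threshold depends on corpus size, so treat `min_count` as a knob to tune rather than a fixed rule.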