by jldugger
1287 days ago
Yes, this is pretty much TF-IDF for people too lazy to count the number of unique items in the corpus. Since that number should be the same (or at least close!) in both good and bad datasets, I'm not sure the extra math matters much.