by geezerjay 3251 days ago
> Is there some way to normalize the document length?

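A basic technique is to weight each term within a document by its term frequency-inverse document frequency (tf-idf) statistic. Using the term's relative frequency (its count divided by the document's length) keeps longer documents from dominating.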

https://en.wikipedia.org/wiki/Tf%E2%80%93idf
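
For what it's worth, here is a minimal sketch of that weighting in plain Python. It assumes one common variant (relative term frequency times the natural log of N/df); there are several others. The function name `tf_idf` and the whitespace tokenization are just illustrative choices, not anything standard.

```python
import math
from collections import Counter

def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears at least once.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            # Relative frequency normalizes for document length;
            # log(N / df) down-weights terms that appear in many documents.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical toy corpus: one short and one long document.
docs = [
    "the cat sat".split(),
    "the cat sat on the mat with the other cat and the dog".split(),
]
weights = tf_idf(docs)
```

Terms that occur in every document (like "the" here) get a weight of zero, and a term's weight depends on its share of the document rather than its raw count, which is what takes care of the length difference.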