by osipov
6635 days ago
I think you are looking for document classification algorithms: http://en.wikipedia.org/wiki/Document_classification The current state-of-the-art algorithms are based on support vector machines, but their training step can be tricky to implement in a scalable fashion. If you are looking for a quick and dirty approach, the TF-IDF algorithm (it is a naive "naive Bayes" :) is simple and adequate for many applications.
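To make the quick and dirty route concrete, here is a rough sketch of one way to classify with TF-IDF: weight each term by how often it appears in a document and how rare it is across the training set, average those weights per class, and assign a new document to the class whose profile is closest by cosine similarity. This is only one reading of "TF-IDF algorithm" (a nearest-centroid classifier), and the names `tokenize`, `train`, `cosine`, and `classify`, the whitespace tokenizer, and the smoothed IDF formula are all assumptions made for illustration, not anything from the comment above.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase + whitespace split is good enough for a sketch.
    return text.lower().split()


def train(labeled_docs):
    """labeled_docs is a list of (label, text) pairs.

    Returns (idf, centroids): idf maps term -> weight, and centroids maps
    label -> Counter of summed TF-IDF weights for that class.
    """
    tokenized = [(label, tokenize(text)) for label, text in labeled_docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for _, tokens in tokenized:
        doc_freq.update(set(tokens))

    # Smoothed IDF (an assumed variant): rare terms get larger weights.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0
           for t, df in doc_freq.items()}

    # Sum the TF-IDF vectors of each class. Cosine similarity ignores
    # vector length, so the sum works as well as the mean here.
    centroids = defaultdict(Counter)
    for label, tokens in tokenized:
        for term, count in Counter(tokens).items():
            centroids[label][term] += (count / len(tokens)) * idf[term]
    return idf, dict(centroids)


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(text, idf, centroids):
    # Terms never seen in training carry no information and are dropped.
    tokens = tokenize(text)
    counts = Counter(t for t in tokens if t in idf)
    vec = {t: (c / len(tokens)) * idf[t] for t, c in counts.items()}
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))
```

Usage would look like `idf, centroids = train([("spam", "buy cheap pills now"), ("ham", "meeting notes for the project")])` followed by `classify("cheap pills", idf, centroids)`. If a document shares no terms with any class, every similarity is zero and the first label wins, so a real application would probably want an explicit "unknown" case.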