by gibrown
3920 days ago
Interesting. This basically uses the background word2vec data for the entire Web to provide more information and help with things like disambiguation, synonyms, etc.? Am I understanding that correctly? Maybe a nit-picky thought, but it's not clear to me that the TF-IDF part is what's doing a lot of the extra lifting there. Do you know of any good evaluations comparing vector space data with other methods for summarization?
|
I've compared the summarization to others like OTS (http://libots.sourceforge.net/), which I believe relies strictly on TF-IDF. This approach seems better, and it allows for context to control the summarization.
Other similar approaches might be based on Latent Semantic Analysis, Latent Semantic Indexing or LDA.
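
To make the comparison concrete, here is a rough sketch of what a TF-IDF-only extractive summarizer does. It is an illustration, not OTS's actual implementation: the function names (`split_sentences`, `tokenize`, `tfidf_summarize`) are made up for this example, each sentence is treated as its own "document" for the IDF part, and the sentence splitter is deliberately naive.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after '.', '!' or '?' plus whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercase word tokens; no stemming or stop-word removal.
    return re.findall(r"[a-z0-9']+", sentence.lower())


def tfidf_summarize(text, n=1):
    """Return the n highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    tokens = [tokenize(s) for s in sentences]
    total = len(sentences)

    # Document frequency: in how many sentences does each term occur?
    df = Counter(term for toks in tokens for term in set(toks))

    def score(toks):
        if not toks:
            return 0.0
        tf = Counter(toks)
        # Mean TF-IDF weight per token, so long sentences are not favoured.
        return sum(c * math.log(total / df[t]) for t, c in tf.items()) / len(toks)

    ranked = sorted(range(total), key=lambda i: score(tokens[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:n])]
```

Note that this scores sentences purely on term rarity within the text itself. There is no notion of synonyms or outside context, which is the gap that background word vectors are meant to fill.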