by valzevul
18 days ago
Hi, OP here! TF-IDF was the first thing I tried - it works great for stopwords, but it doesn't handle cross-language bleed of filler words well, and the short life-event messages ("he died", etc.) use common words, so they get aggressively down-weighted. I did some asymmetry analysis when looking at directional sentiment and per-person question rates - that's fun indeed! I also went with Jaccard convergence and endearment categories instead of word clouds, so that I could see how word choices change over time.
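
For readers unfamiliar with the term, here is a minimal sketch of how a Jaccard convergence measure could be computed: the overlap between two people's vocabularies, tracked per time period. This is an illustration and not the project's actual code. The function names, the `(period, sender, text)` input shape, and the whitespace tokenization are all assumptions.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity of two sets: |A & B| / |A | B| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def vocab_convergence(messages):
    """Map each period to the Jaccard similarity of the two senders' vocabularies.

    `messages` is an iterable of (period, sender, text) tuples (hypothetical shape).
    Periods in which fewer or more than two senders wrote are skipped.
    """
    vocab = defaultdict(lambda: defaultdict(set))
    for period, sender, text in messages:
        # Naive tokenization (assumption): lowercase and split on whitespace.
        vocab[period][sender].update(text.lower().split())

    result = {}
    for period in sorted(vocab):
        word_sets = list(vocab[period].values())
        if len(word_sets) == 2:
            result[period] = jaccard(word_sets[0], word_sets[1])
    return result


sample = [
    ("2020-01", "a", "hello there friend"),
    ("2020-01", "b", "hello you"),
    ("2020-02", "a", "good night love"),
    ("2020-02", "b", "good night love"),
]
scores = vocab_convergence(sample)
```

A score that rises across periods would suggest the two people are using increasingly similar words. A real pipeline would likely need better tokenization and a minimum message count per period to keep the scores stable.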