by vantubbe 790 days ago
How I did it in 2 steps:

1. Summarization - Though short, microblogs still carry a lot of nonessential info (this paragraph could be a sentence). Summarization strips that out.

2. Semantic Similarity Grouping - So many headlines and microblogs are essentially saying the same thing. I only need to see it once.
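
For anyone curious what step 2 could look like, here is a rough sketch, not necessarily how the actual project does it. It assumes a greedy grouping pass with a cosine-similarity cutoff. The `embed` function is a bag-of-words stand-in for a real sentence-embedding model, and the names `embed`, `cosine`, `group_similar` and the `threshold` value of 0.6 are all made up for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Stand-in 'embedding': lowercase word counts.

    Assumption: a real system would likely use a sentence-embedding model here.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def group_similar(texts, threshold=0.6):
    """Greedily put each text into the first group whose representative
    (the group's first member) is at least `threshold` similar to it."""
    groups = []  # list of (representative_vector, member_texts)
    for text in texts:
        vec = embed(text)
        for rep_vec, members in groups:
            if cosine(vec, rep_vec) >= threshold:
                members.append(text)
                break
        else:
            # No existing group was close enough, so start a new one.
            groups.append((vec, [text]))
    return [members for _, members in groups]


headlines = [
    "Apple releases new iPhone today",
    "New iPhone released by Apple today",
    "Stocks fall as rates rise",
]
groups = group_similar(headlines)
# Showing only the first member of each group means each story appears once.
unique_stories = [members[0] for members in groups]
```

Word overlap only catches near-duplicates that share vocabulary; paraphrases with different wording would need real embeddings, and the threshold would need tuning against actual data.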

Would love any feedback.