by steve1820
2099 days ago
Vaporware is better than going nowhere! (Get it... noware... haha.) Congrats on getting started. I agree with Obsidian: I think most people forget the maintenance time it takes to build a lifelong Knowledge Management System. I like your idea; document similarity is a well-known area in ML. Feel free to take my Chrome Extension and use the parts that track key paragraphs in an article (using a user's click/hover/attention behaviour) as the corpus for your ML similarity models. Intuitively it makes more sense to run document similarity on key points/paragraphs than on the whole web page. If you want the whole web page though, there's code in the Chrome Extension that uses Mozilla's readability lib (https://github.com/mozilla/readability) to strip the page down to its main content.
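To make the "similarity on key paragraphs" idea concrete, here is a rough sketch of one common baseline: TF-IDF vectors compared with cosine similarity. This is not code from the extension. The function names (`tokenize`, `tfidf_vectors`, `cosine`) and the sample paragraphs are made up for illustration, and the smoothed IDF formula is just one common variant. It assumes you already have the tracked paragraphs as plain strings.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Hypothetical helper: lowercase and split into alphanumeric tokens.
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(paragraphs):
    # Build one sparse TF-IDF vector (a dict of term -> weight) per paragraph.
    tokenized = [tokenize(p) for p in paragraphs]
    n = len(tokenized)

    # Document frequency: in how many paragraphs each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    # Smoothed inverse document frequency (one common variant, an assumption here).
    idf = {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}

    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({term: count * idf[term] for term, count in tf.items()})
    return vectors


def cosine(a, b):
    # Cosine similarity between two sparse vectors; 0.0 if either is empty.
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up key paragraphs standing in for what the extension would capture.
key_paragraphs = [
    "neural networks learn document embeddings",
    "document embeddings from neural networks",
    "sourdough bread needs a long proof",
]
vectors = tfidf_vectors(key_paragraphs)
for i in range(1, len(key_paragraphs)):
    print(i, round(cosine(vectors[0], vectors[i]), 3))
```

Running it on short key paragraphs rather than full pages keeps navigation text, footers and ads out of the term counts, which is the intuition behind using the tracked paragraphs as the corpus. For anything beyond a toy, a library vectorizer or sentence embeddings would probably be a better fit.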