by frenchmajesty 281 days ago

OP here. I agree! I should've called out why I did _not_ follow that approach, since many others have commented the same.

The main reason is that I needed the classification to be ongoing. My system pulled in thousands of tweets per day, and they all needed to be classified as they came in for some downstream tasks.

Thus, I couldn't embed all tweets, then cluster, then ...

2 comments

bungalowmunch 279 days ago

Do the labels need to be static once the system has started? If not, it would be interesting to relabel embedding clusters once each hits a certain critical mass of tweets, or do so somewhat continuously.
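
For illustration, here is a rough sketch of what relabeling at a critical mass could look like on a stream. It is not the OP's system: the class `StreamingLabeler`, the `relabel` callback, the similarity `threshold`, and `critical_mass` are all hypothetical names and values, and the embeddings are assumed to come from whatever model is already in use. Each incoming tweet joins the nearest cluster by cosine similarity (or starts a new one), and a cluster's label is regenerated each time its size reaches a multiple of `critical_mass`.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


@dataclass
class Cluster:
    centroid: List[float]
    label: str = "unlabeled"
    texts: List[str] = field(default_factory=list)


class StreamingLabeler:
    """Hypothetical online clusterer that relabels clusters as they grow."""

    def __init__(
        self,
        relabel: Callable[[List[str]], str],  # assumed: e.g. an LLM call
        threshold: float = 0.8,               # assumed similarity cutoff
        critical_mass: int = 50,              # assumed relabel interval
    ) -> None:
        self.relabel = relabel
        self.threshold = threshold
        self.critical_mass = critical_mass
        self.clusters: List[Cluster] = []

    def add(self, text: str, embedding: Sequence[float]) -> str:
        # Find the most similar existing cluster, if any.
        best = max(
            self.clusters,
            key=lambda c: cosine(c.centroid, embedding),
            default=None,
        )
        if best is None or cosine(best.centroid, embedding) < self.threshold:
            # Nothing close enough: start a new cluster.
            best = Cluster(centroid=list(embedding))
            self.clusters.append(best)
        else:
            # Update the centroid as a running mean of member embeddings.
            n = len(best.texts)
            best.centroid = [
                (c * n + e) / (n + 1) for c, e in zip(best.centroid, embedding)
            ]
        best.texts.append(text)
        # Regenerate the label whenever the cluster hits a size multiple.
        if len(best.texts) % self.critical_mass == 0:
            best.label = self.relabel(best.texts)
        return best.label
```

Tweets still get a label the moment they arrive, so downstream tasks are not blocked; the trade-off is that a cluster's label can change over time, which only works if consumers tolerate that.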


pietz 281 days ago

Makes sense, I appreciate the comment. Well written article. Subscribed.
