by nathan_compton 148 days ago
Natural to use LM embeddings for this.
1 comment

Yeah, convert each entry to an embedding, check whether it's within a certain distance of an existing embedding, and if so store it with that cluster and increment its count? Then check further entries against the cluster's average so clusters don't increase their "reach" indefinitely.
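Rough sketch of what I mean, in case it helps. The embeddings here are plain lists of floats from whatever model you use. The names (`Cluster`, `assign`, `threshold`) and the choice of Euclidean distance are my own assumptions for illustration, not anything from a particular library; cosine distance would probably work just as well.

```python
import math


class Cluster:
    """One cluster, summarized by a running-mean centroid and a count."""

    def __init__(self, vector):
        self.centroid = list(vector)
        self.count = 1

    def add(self, vector):
        # Incremental mean update: the centroid stays the average of all
        # members, so a single outlier cannot drag the cluster far.
        self.count += 1
        for i, x in enumerate(vector):
            self.centroid[i] += (x - self.centroid[i]) / self.count


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign(clusters, vector, threshold):
    """Put vector in the nearest cluster within threshold, else start a new one.

    Returns the index of the cluster the vector ended up in.
    """
    best, best_dist = None, threshold
    for idx, cluster in enumerate(clusters):
        d = euclidean(cluster.centroid, vector)
        if d <= best_dist:
            best, best_dist = idx, d
    if best is None:
        clusters.append(Cluster(vector))
        return len(clusters) - 1
    clusters[best].add(vector)
    return best
```

Since new entries are compared against the centroid rather than against any single member, a cluster can't chain outward one neighbor at a time. The threshold is something you'd have to tune for your embedding model.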