Hacker News
by nerdright 590 days ago
Great post indeed! I totally agree that embeddings are underrated. I feel like the "information retrieval/discovery" world is stuck using spears (i.e., term/keyword-based discovery) instead of embracing the modern tools (i.e., semantic-based discovery).

The other day I found myself trying to figure out some common themes across a bunch of comments I was looking at. I was too lazy to read through all of them, so I turned to the "Sentence Transformers" lib. I converted each comment into a vector embedding, ran k-means clustering on those embeddings, then gave each cluster's comments to ChatGPT to summarize. I have to admit, it was fun, and it saved me lots of time!
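For anyone who wants to try it, here is a rough sketch of that pipeline, not my exact script. The model name "all-MiniLM-L6-v2", the cluster count, and the helper names (embed_and_cluster, group_by_cluster, build_summary_prompt) are assumptions picked for illustration, and the last step only builds the prompt text; actually sending it to ChatGPT is left out.

```python
from collections import defaultdict


def embed_and_cluster(comments, n_clusters=5, model_name="all-MiniLM-L6-v2"):
    """Embed each comment and return one k-means cluster label per comment."""
    # Imported lazily so the pure-Python helpers below work without these packages.
    from sentence_transformers import SentenceTransformer
    from sklearn.cluster import KMeans

    model = SentenceTransformer(model_name)  # assumed model choice
    embeddings = model.encode(comments)      # array of shape (n_comments, dim)
    # n_clusters is a guess; it must not exceed the number of comments.
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    return kmeans.fit_predict(embeddings).tolist()


def group_by_cluster(comments, labels):
    """Collect comments into a dict keyed by cluster label."""
    groups = defaultdict(list)
    for comment, label in zip(comments, labels):
        groups[int(label)].append(comment)
    return dict(groups)


def build_summary_prompt(cluster_comments):
    """Build the text to paste (or send) to a chat model for one cluster."""
    joined = "\n".join(f"- {c}" for c in cluster_comments)
    return "Summarize the common theme of these comments:\n" + joined
```

Usage would be something like: labels = embed_and_cluster(comments), then groups = group_by_cluster(comments, labels), then one build_summary_prompt(...) call per cluster. Picking the number of clusters is the fiddly part; I would try a few values and see which grouping reads best.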

1 comment

Interesting approach. Did you tell GPT to summarise the comments of each cluster after grouping them?