Hacker News
by evrydayhustling 1376 days ago
Can confirm. We use sentence-level transformer embeddings for (vector) search, clustering, and classification tasks. As an old-school ML guy, I've been amazed at how robust they are to typos, slang, punctuation, etc.

However, I'm sure there are still applications where you don't have access to a robust embedding for your domain but can apply other techniques to deal with that domain's noise.
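
For anyone who hasn't seen the (vector) search use case mentioned above, here is a minimal sketch of the idea: embed the query and each document, then rank by cosine similarity. Everything named here is an assumption for illustration (`toy_embed`, `search`, the sample corpus). In particular, `toy_embed` is NOT a transformer; it is a hashed character-trigram stand-in so the snippet runs with only the standard library.

```python
import math
import zlib
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def toy_embed(text: str, dim: int = 1024) -> Vector:
    """Stand-in embedder (an assumption, not a transformer):
    hashed character trigrams, L2-normalised."""
    padded = f"  {text.lower()} "
    grams = Counter(padded[i:i + 3] for i in range(len(padded) - 2))
    vec = [0.0] * dim
    for gram, count in grams.items():
        # crc32 is deterministic across runs, unlike the built-in hash().
        vec[zlib.crc32(gram.encode("utf-8")) % dim] += count
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def search(
    query: str,
    corpus: Sequence[str],
    embed: Callable[[str], Sequence[float]] = toy_embed,
    top_k: int = 3,
) -> List[Tuple[float, str]]:
    """Return the top_k (score, document) pairs by cosine similarity.

    Assumes `embed` returns unit-length vectors, so the dot product
    equals the cosine similarity.
    """
    q = embed(query)
    scored = [
        (sum(a * b for a, b in zip(q, embed(doc))), doc) for doc in corpus
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    docs = [
        "How do I reset my password?",
        "Shipping times for international orders",
        "Refund policy for damaged items",
    ]
    # A query with typos and no punctuation.
    for score, doc in search("how do i resett my pasword", docs):
        print(f"{score:.3f}  {doc}")
```

In practice you would pass a real sentence-embedding model as `embed` (for example, something like `SentenceTransformer.encode` from the sentence-transformers library, with normalised outputs). The ranking code stays the same; only the quality of the vectors changes.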

1 comment

Here is a decent intro to sentence-level transformers and embeddings:

https://www.pinecone.io/learn/sentence-embeddings/