| HN Mirror



	by jayalammar 1043 days ago
	This is my sense as well. Text generation LLMs haven't been the best source of embeddings for other downstream use cases. If you're optimizing for token embeddings (e.g., for NER, span detection, or token classification tasks), then a token-level training objective is important. If you need text-level embeddings (e.g., for semantic search or text classification), then a text-level training objective is required (e.g., what SentenceBERT did to optimize BERT embeddings for semantic search). That's a great list of existing embedding models (in addition to the SentenceBERT models: https://www.sbert.net/docs/pretrained_models.html).
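	To make the token-level vs. text-level distinction concrete, here is a minimal sketch of the usual bridge between the two: mean pooling the per-token vectors into a single text vector. The function name `mean_pool` and the toy vectors are made up for illustration; a real model would supply the token vectors and the attention mask, and pooling alone does not replace the text-level training objective discussed above.

```python
def mean_pool(token_vectors, attention_mask):
    """Average the vectors of non-padding tokens into one text-level vector.

    token_vectors: list of equal-length lists of floats, one per token.
    attention_mask: list of 0/1 flags, 1 for real tokens, 0 for padding.
    (Both inputs are hypothetical stand-ins for a model's outputs.)
    """
    dim = len(token_vectors[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_vectors, attention_mask):
        if keep:
            for d in range(dim):
                total[d] += vec[d]
            count += 1
    if count == 0:
        raise ValueError("attention_mask marks no real tokens")
    return [t / count for t in total]


# Toy example: three token vectors, the last one is padding and is ignored.
tokens = [[1.0, 0.0], [3.0, 2.0], [9.0, 9.0]]
mask = [1, 1, 0]
text_vector = mean_pool(tokens, mask)
```

	The per-token vectors are what you would feed to an NER or span-detection head; the pooled vector is what you would index for semantic search.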

1 comment

readyplayeremma 1043 days ago

The SGPT model is a very high-performing text embedding model adapted from a decoder. Using the same techniques with Llama-2 might perform better than you expect. I think someone will need to try these things before we know for certain. I believe there is still room for significant improvement with embedding models.
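For a rough idea of what adapting a decoder can involve: as I understand the SGPT paper, one ingredient is position-weighted mean pooling, where later tokens get larger weights because under causal attention they have attended to more of the input. The sketch below shows only that pooling step on toy vectors; the function name is made up, and this is not the SGPT code or its training recipe.

```python
def position_weighted_mean_pool(token_vectors):
    """Pool token vectors with weights proportional to token position.

    Token i (1-based) gets weight i / (1 + 2 + ... + n), so the last
    token contributes the most. token_vectors is a hypothetical stand-in
    for a decoder's final hidden states, with padding already removed.
    """
    n = len(token_vectors)
    if n == 0:
        raise ValueError("need at least one token vector")
    dim = len(token_vectors[0])
    weight_sum = n * (n + 1) / 2  # 1 + 2 + ... + n
    pooled = [0.0] * dim
    for position, vec in enumerate(token_vectors, start=1):
        weight = position / weight_sum
        for d in range(dim):
            pooled[d] += weight * vec[d]
    return pooled


# Toy example: two tokens, so the weights are 1/3 and 2/3.
pooled = position_weighted_mean_pool([[1.0, 0.0], [0.0, 1.0]])
```

With plain mean pooling both tokens would count equally; here the second token counts twice as much as the first.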
