| HN Mirror



	by warangal 682 days ago
	Embeddings capture a lot of semantic information based on the training data and objective function, and can be used independently for a lot of useful tasks. I used to use embeddings from the text encoder of the CLIP model to augment the prompt so it would better match the corresponding images. For example, given a word like "building" in the prompt, I would find its nearest neighbors in the embedding matrix, like "concrete", "underground" etc., and substitute/append those after the corresponding word. This led to higher recall for most of the queries in my limited experiments!
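	A minimal sketch of that nearest-neighbor augmentation, in case it helps. It assumes a toy vocabulary and a hand-made embedding matrix standing in for the CLIP text encoder's token embeddings; `nearest_tokens`, `augment_prompt`, and the numbers are hypothetical, not taken from CLIP.

```python
import numpy as np

def nearest_tokens(word, vocab, emb, k=2):
    """Return the k vocabulary entries closest to `word` by cosine similarity."""
    idx = vocab.index(word)
    # Normalize rows so a dot product equals cosine similarity.
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sims = unit @ unit[idx]
    sims[idx] = -np.inf  # exclude the query word itself
    top = np.argsort(-sims)[:k]
    return [vocab[i] for i in top]

def augment_prompt(prompt, vocab, emb, k=2):
    """Append the k nearest neighbors after every prompt word found in vocab."""
    out = []
    for w in prompt.split():
        out.append(w)
        if w in vocab:
            out.extend(nearest_tokens(w, vocab, emb, k))
    return " ".join(out)

# Toy stand-in for an embedding matrix (assumption: one row per vocab entry).
vocab = ["building", "concrete", "underground", "cat", "kitten"]
emb = np.array([
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.0],
    [0.8, 0.3, 0.1],
    [0.0, 0.1, 1.0],
    [0.1, 0.0, 0.9],
])

print(augment_prompt("a tall building", vocab, emb))
```

	With a real model you would swap in its token embedding matrix and tokenizer vocabulary; whether to append or substitute the neighbors is a tuning choice.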

2 comments

nostrebored 682 days ago

Yup, and you can train these in-domain contextual relationships into the embedding models.

https://www.marqo.ai/blog/generalized-contrastive-learning-f...


deepsquirrelnet 682 days ago

That’s a really cool idea. I’ll think about it some more, because it sounds like a feasible implementation for this. I think if you take the magnitude of any token embedding in wordllama, it might also help identify important tokens to augment. But it might work a lot better if trained on data selected for this task.
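A rough sketch of the magnitude idea, assuming the token embeddings are available as a plain NumPy matrix. The vocabulary, the values, and `rank_tokens_by_norm` are made up for illustration, and this does not use the wordllama API.

```python
import numpy as np

def rank_tokens_by_norm(tokens, vocab, emb):
    """Sort the given tokens by the L2 norm of their embedding, largest first."""
    norms = np.linalg.norm(emb, axis=1)
    index = {tok: i for i, tok in enumerate(vocab)}
    scored = [(t, float(norms[index[t]])) for t in tokens if t in index]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy embeddings (assumption): function words get small norms here by construction.
vocab = ["the", "a", "building", "tall"]
emb = np.array([
    [0.1, 0.0],
    [0.0, 0.1],
    [3.0, 4.0],
    [1.0, 1.0],
])

ranked = rank_tokens_by_norm("the tall building".split(), vocab, emb)
print(ranked)
```

The highest-ranked tokens would be the candidates to augment; whether norm actually tracks importance for a given model is something to check empirically.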
