by minimaxir
974 days ago
That's a property of Word2Vec specifically, due to how it's trained (a shallow network where most of the "logic" ends up contained within the embeddings themselves). Applying it to embeddings generated by LLMs or embedding layers will not give results that are as fun; in practice the only thing you can do is average or cluster them.
|
Is it though? I thought the LLM-based embeddings are even more fun for this, as you have many more interesting directions to move in. I.e. not just:
emb("king") - emb("man") + emb("woman") = emb("queen")
But also e.g.:
emb(<a positive book review, a couple of paragraphs long>) + a*v(sad) + b*v(short) - c*v(positive) = emb(<a single-paragraph, negative and depressing review>)
Where a, b, c are constants to tweak, and v(X) is a vector for quality X, which you can get by embedding a bunch of texts expressing the quality X and averaging the embeddings (or doing some other dimensionality reduction trickery).
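To make the arithmetic concrete, here is a minimal sketch in Python with NumPy. It does not call a real embedding model: `emb` is a hypothetical stand-in that maps a set of quality tags to a sum of fixed random directions plus noise, so it only illustrates the mechanics (averaging to get v(X), then adding and subtracting scaled quality vectors), not how well this works on real LLM embeddings. The tag names, the dimension, and a = b = c = 1 are all assumptions picked for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 64

# Hypothetical stand-in for a real embedding model: a toy "text" is a list of
# quality tags, embedded as the sum of one fixed random direction per tag, plus noise.
TAGS = ["positive", "sad", "short", "long"]
_TAG_DIRS = {tag: rng.normal(size=DIM) for tag in TAGS}

def emb(tags, noise=0.1):
    """Toy embedding (assumption): sum of the tag directions plus Gaussian noise."""
    return sum(_TAG_DIRS[t] for t in tags) + noise * rng.normal(size=DIM)

def quality_vector(examples):
    """v(X): average the embeddings of several texts that express quality X."""
    return np.mean([emb(tags) for tags in examples], axis=0)

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Estimate one vector per quality from 20 noisy example "texts" each.
v_sad = quality_vector([["sad"]] * 20)
v_short = quality_vector([["short"]] * 20)
v_positive = quality_vector([["positive"]] * 20)

source = emb(["long", "positive"])   # long, positive review
target = emb(["short", "sad"])       # short, sad review

a, b, c = 1.0, 1.0, 1.0              # constants to tweak (arbitrary here)
steered = source + a * v_sad + b * v_short - c * v_positive

before = cosine(source, target)
after = cosine(steered, target)
print(f"cosine to target before steering: {before:.3f}")
print(f"cosine to target after steering:  {after:.3f}")
```

With a real model you would replace `emb` with the model's embedding call, and you would still need some way to turn the steered vector back into text (a decoder, or a nearest-neighbour lookup over candidate texts), which this sketch leaves out.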
I suggested this on HN some time ago, but was only told that I'm confused and that the idea is not even wrong. But then there was a talk at an AI conference recently[0], where the speaker demonstrated exactly this kind of latent-space translation of text in a language model.
--
[0] - https://www.youtube.com/watch?v=veShHxQYPzo&t=13980s - "The Hidden Life of Embeddings", by Linus Lee from Notion.