| HN Mirror


by picometer 909 days ago

In hindsight, reviewer f5bf’s comment is fascinating:

> - It would be interesting if the authors could say something about how these models deal with intransitive semantic similarities, e.g., with the similarities between 'river', 'bank', and 'bailout'. People like Tversky have advocated against the use of semantic-space models like NLMs because they cannot appropriately model intransitive similarities.

What I’ve noticed in the latest models (GPT, image diffusion models, etc.) is an ability to play with words when there’s a double meaning. This struck me as something that used to be very human, but is now in the toolbox of generative models. (Most of which, I assume, use something akin to word2vec for deriving embedding vectors from prompts.)

Is the word2vec ambiguity contributing to the wordplay ability? I don’t know, but it points to a “feature vs bug” situation where such an ambiguity is a feature for creative purposes, but a bug if you want to model semantic space as a strict vector space.

My interpretation here is that the word/prompt embeddings in current models are so huge that they’re overloaded with redundant dimensions, such that they wouldn’t satisfy any mathematical formalism (e.g., that of a well-behaved vector space) at all.

2 comments

PeterisP 908 days ago

The key difference is what I'd call "context-free embeddings" vs "contextual embeddings". Due to their structure, word2vec and similar solutions have to assign every single "bank" in every sentence the exact same vector, but later models (e.g. all the transformer models, BERT, GPT, etc.) will assign wildly different vectors to "bank" depending on the context of surrounding words for that particular mention of "bank".
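
To make the distinction concrete, here is a toy sketch in Python. The vectors in `STATIC` are made up, and `contextual_embed` is a hypothetical stand-in that just averages a token with its neighbours; real transformer models use stacked attention layers, not this averaging. It only illustrates that a lookup table gives "bank" one vector everywhere, while anything that mixes in context does not.

```python
# Made-up 2-d vectors, for illustration only (not real word2vec output).
STATIC = {
    "river": [1.0, 0.0],
    "bank": [0.5, 0.5],
    "bailout": [0.0, 1.0],
}

def static_embed(tokens):
    """Context-free: every token gets its table entry, whatever surrounds it."""
    return [STATIC[t] for t in tokens]

def contextual_embed(tokens, window=1):
    """Toy contextual embedding: average each token's vector with its neighbours'.

    This is an assumption made for the sketch, not how BERT or GPT work.
    """
    base = static_embed(tokens)
    dim = len(base[0])
    out = []
    for i in range(len(tokens)):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        span = base[lo:hi]
        out.append([sum(v[d] for v in span) / len(span) for d in range(dim)])
    return out

s1 = ["river", "bank"]
s2 = ["bank", "bailout"]

# Same vector for "bank" in both sentences with the lookup table.
static_same = static_embed(s1)[1] == static_embed(s2)[0]

# Different vectors for "bank" once the neighbours are mixed in.
ctx_river = contextual_embed(s1)[1]
ctx_bailout = contextual_embed(s2)[0]
print(static_same, ctx_river, ctx_bailout)
```

With the table alone, the two mentions of "bank" are indistinguishable; with the toy mixing, the one next to "river" leans toward the first dimension and the one next to "bailout" toward the second.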


intalentive 909 days ago

Even small models (e.g. hidden dims = 32) should be able to handle token ambiguity with attention. The information is not so much in the token itself as in the context.
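
A rough sketch of that idea, using a single self-attention step in plain Python. Everything here is an assumption made for illustration: the 2-d vectors in `EMB` are made up (2 dims instead of 32 to keep it readable), and queries, keys and values are the raw embeddings with no learned projections, which a trained model would have.

```python
import math

# Made-up 2-d embeddings; "bank" sits between its two senses.
EMB = {
    "river": [2.0, 0.0],
    "bank": [1.0, 1.0],
    "bailout": [0.0, 2.0],
}

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """One scaled dot-product self-attention step with identity projections."""
    vecs = [EMB[t] for t in tokens]
    dim = len(vecs[0])
    scale = math.sqrt(dim)
    out = []
    for q in vecs:
        # Score the query against every token in the sentence, itself included.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in vecs]
        weights = softmax(scores)
        # The output is the attention-weighted average of the value vectors.
        out.append([sum(w * v[d] for w, v in zip(weights, vecs)) for d in range(dim)])
    return out

bank_near_river = self_attention(["river", "bank"])[1]
bank_near_bailout = self_attention(["bank", "bailout"])[0]
print(bank_near_river, bank_near_bailout)
```

The input vector for "bank" is identical in both sentences, but its output is pulled toward whichever neighbour it attends to, so the disambiguating information comes from the context rather than from the token.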
