Hacker News
by konstruction 951 days ago
You can already use the embeddings as features (as input) to another model that is then trained only on the embedding vectors. In this sense, they are interchangeable.
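To make the "embeddings as features" point concrete, here is a minimal sketch. The embeddings are synthetic stand-ins (random vectors with a class-dependent shift), not the output of any real model, and `train_logreg` is a hypothetical helper written for this example. The downstream classifier only ever sees the embedding vectors, never the raw inputs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for vectors produced by a frozen upstream model (assumption).
n, d = 400, 16
labels = rng.integers(0, 2, size=n)
direction = np.zeros(d)
direction[0] = 2.0  # the two classes differ along one embedding dimension
embeddings = rng.normal(size=(n, d)) + np.where(labels[:, None] == 1, 1.0, -1.0) * direction

X_train, y_train = embeddings[:300], labels[:300]
X_test, y_test = embeddings[300:], labels[300:]


def train_logreg(X, y, lr=0.1, steps=500):
    """Plain gradient-descent logistic regression on fixed feature vectors."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted P(y = 1)
        grad = p - y                            # gradient of the log loss w.r.t. the logits
        w -= lr * X.T @ grad / len(y)
        b -= lr * grad.mean()
    return w, b


# The downstream model is trained on the embedding vectors only.
w, b = train_logreg(X_train, y_train)
predictions = ((X_test @ w + b) > 0).astype(int)
accuracy = float((predictions == y_test).mean())
print(f"held-out accuracy: {accuracy:.2f}")
```

With real embeddings you would swap the synthetic `embeddings` array for vectors exported from the upstream model; the downstream training loop would not need to change.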

This goes even further: a model sophisticated enough to capture a probability distribution will produce embeddings that encode that distribution (to some extent), so any two models of that kind produce "equivalent" embeddings that can be transformed into each other. This is an area of active research (in fact, I've just been to a seminar talk about it).
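As a toy illustration of "can be transformed into each other", the sketch below assumes the simplest possible case: two embedding spaces that differ only by a rotation plus a little noise. That assumption is mine, not something established for real models, where the relationship may be only approximate or non-linear. Under that assumption, the map can be recovered from paired vectors with orthogonal Procrustes (an SVD of the cross-covariance matrix). All variable names are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 500, 8

# Pretend both models embed the same n items (assumption: same dimensionality).
emb_a = rng.normal(size=(n, d))

# Model B's space is taken to be a rotated, slightly noisy copy of model A's space.
true_rotation, _ = np.linalg.qr(rng.normal(size=(d, d)))
emb_b = emb_a @ true_rotation + 0.01 * rng.normal(size=(n, d))

# Orthogonal Procrustes: the orthogonal R minimizing ||emb_a @ R - emb_b||_F
# is U @ Vt, where U, S, Vt is the SVD of emb_a.T @ emb_b.
u, _, vt = np.linalg.svd(emb_a.T @ emb_b)
recovered = u @ vt

error_before = np.linalg.norm(emb_a - emb_b) / np.linalg.norm(emb_b)
error_after = np.linalg.norm(emb_a @ recovered - emb_b) / np.linalg.norm(emb_b)
print(f"relative error before alignment: {error_before:.3f}")
print(f"relative error after alignment:  {error_after:.3f}")
```

If the alignment error stays high on real paired embeddings, that would suggest the two spaces are not related by a simple rotation, and a richer (for example learned, non-linear) map would be needed.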

So the answer to the "How can we .." would be: by capturing the distribution, by making the embedding big enough and the training task difficult enough.

Examples of embeddings that are re-used are variants of word2vec, CLIP and CLAP.

As others have already mentioned: the hash analogy would be correct if you think about non-cryptographic hashes, but I doubt that this clarifies anything.