by refulgentis
641 days ago
There's a recurring confusion about embeddings: the idea that they're very well behaved when visualized in two or three dimensions. IMHO it's a category error that comes from tutorials using the king - man + woman = queen example (which, funnily enough, wasn't even true for the original word2vec, if commentary I've read previously here is correct). Working with them a lot has me picture them more as "a multivariate function that outputs 768 numbers, and was learned by brute force" than "something that sees in 768 dimensions". Of course, both are true, but the second interpretation obscures more than it illuminates once you're past the very first question of "so what is this calculating, exactly?"
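For anyone who hasn't seen how the analogy is actually evaluated, here is a minimal sketch. The 3-d vectors are invented for illustration and do not come from word2vec or any trained model; `VECS`, `cosine`, and `nearest` are hypothetical names. It shows one commonly cited caveat: the arithmetic only "lands on queen" if the input words are excluded from the candidate list.

```python
import math

# Toy 3-d vectors, made up for illustration (NOT taken from any real model).
VECS = {
    "king":  [1.0,  0.2, 0.0],
    "queen": [1.0, -0.2, 0.5],
    "man":   [0.1,  0.2, 0.0],
    "woman": [0.1, -0.1, 0.0],
    "apple": [0.0,  0.0, 1.0],
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(target, exclude=()):
    """Word whose vector is most cosine-similar to target, skipping excluded words."""
    candidates = [w for w in VECS if w not in exclude]
    return max(candidates, key=lambda w: cosine(target, VECS[w]))

# king - man + woman, computed component by component
target = [k - m + w for k, m, w in zip(VECS["king"], VECS["man"], VECS["woman"])]

print(nearest(target))                                    # king
print(nearest(target, exclude={"king", "man", "woman"}))  # queen
```

With these toy numbers the offset between "man" and "woman" is small compared to the length of "king", so the result barely moves away from where it started. Whether real models behave this way depends on the model and the word set.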
Put more clearly, symmetric retrieval is easier to visualize in a dimensionality-reduced space than asymmetric retrieval.
I suspect that some form of multi-vector document embedding would be more understandable in the reduced space than this single-vector representation.
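One possible reading of "multi-vector document embedding", sketched under stated assumptions: the document keeps one vector per chunk and is scored by its best-matching chunk, instead of being collapsed to a single averaged vector. The 2-d vectors and the names `chunk_vecs`, `query_vec`, `mean_pool`, and `cos_sim` are hypothetical, and max-over-chunks is only one of several scoring schemes that could be meant here.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def mean_pool(vectors):
    """Average a list of vectors into one vector (single-vector representation)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

# Hypothetical document with two chunks about unrelated topics (toy 2-d vectors).
chunk_vecs = [[1.0, 0.0], [0.0, 1.0]]
# A short query that matches only the first chunk.
query_vec = [1.0, 0.0]

# Single-vector: the query is compared to the averaged document point.
single_score = cos_sim(query_vec, mean_pool(chunk_vecs))

# Multi-vector: the query is compared to every chunk, and the best match wins.
multi_score = max(cos_sim(query_vec, c) for c in chunk_vecs)

print(round(single_score, 4))  # 0.7071
print(round(multi_score, 4))   # 1.0
```

In a reduced-space plot, the multi-vector version would let the query sit right on top of the chunk it matches, whereas the pooled document point sits between its chunks and away from the query.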