by DougBTX
879 days ago
> Always felt they're more like hashes/fingerprints for the RAG use cases.

Yes, I see where you’re coming from. Perceptual hashes[0] are pretty similar: the key is that similar documents should have similar embeddings (unlike cryptographic hashes, where a single bit flip in the input should produce a completely different hash).

Nice embeddings encode information spatially. A classic example of embedding arithmetic is: king - man + woman = queen[1]. “Concept Sliders” is a cool application of this to image generation[2].

Personally I’ve not had _too_ much trouble with running out of RAM due to the embeddings themselves, but I did spend a fair amount of time last week profiling memory usage to make sure I didn’t run out in prod, so it is on my mind!

[0] https://en.m.wikipedia.org/wiki/Perceptual_hashing

[1] https://www.technologyreview.com/2015/09/17/166211/king-man-...

[2] https://github.com/rohitgandikota/sliders
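To make the hash vs. embedding contrast concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: `toy_embedding` is a hypothetical bag-of-words stand-in for a real embedding model, and the 2-D word vectors are hand-picked so the arithmetic works out, not learned from data.

```python
import hashlib
import math
from collections import Counter


def hamming_fraction(a: bytes, b: bytes) -> float:
    """Fraction of bits that differ between two equal-length byte strings."""
    diff = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return diff / (8 * len(a))


def toy_embedding(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: plain word counts."""
    return Counter(text.lower().split())


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


doc_a = "the quick brown fox jumps over the lazy dog"
doc_b = "the quick brown fox jumps over the lazy cat"

# Cryptographic hash: a one-word change scrambles roughly half the bits.
hash_diff = hamming_fraction(
    hashlib.sha256(doc_a.encode()).digest(),
    hashlib.sha256(doc_b.encode()).digest(),
)

# Embedding-like vector: a one-word change barely moves the similarity.
similarity = cosine(toy_embedding(doc_a), toy_embedding(doc_b))

# Hand-picked 2-D vectors (axes: royalty, gender), purely illustrative.
vectors = {
    "king": (1.0, 1.0),
    "man": (0.0, 1.0),
    "woman": (0.0, -1.0),
    "queen": (1.0, -1.0),
}
target = tuple(
    k - m + w
    for k, m, w in zip(vectors["king"], vectors["man"], vectors["woman"])
)
nearest = min(vectors, key=lambda name: math.dist(vectors[name], target))

print(f"SHA-256 bits changed: {hash_diff:.0%}")
print(f"toy embedding cosine similarity: {similarity:.3f}")
print(f"king - man + woman is nearest to: {nearest}")
```

With a real embedding model the vectors have hundreds of dimensions and the analogy only holds approximately, but the shape of the idea is the same: nearby inputs land near each other, which is exactly what a cryptographic hash is designed to prevent.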