by cricketlover
753 days ago
I went through the post and I have absolutely no clue what this person is talking about. But I want to be in a place where I can understand what the person is saying. How can I reach that point? I was lost at quantized, could understand bit packing, and was even more lost when the author started talking about things like Hamming Distance. Please help me out. I want to grow my career in this direction.
Then you need to understand binarization. This is a surprisingly effective trick based on the observation that if you have an embedding vector of, say, 1000 numbers, those numbers will, for many models, be very small floating point numbers just above or below zero.
It turns out you can turn those thousand floating point numbers into one thousand single bits where each bit simply represents if the value is above or below zero... and the embedding magic mostly still works!
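To make that concrete, here is a rough sketch of the sign-based binarization described above, plus the bit packing step. The function names `binarize` and `pack_bits` and the tiny 8-number vector are made up for illustration; a real embedding would have hundreds or thousands of values, and a real pipeline would likely use a numeric library rather than plain lists.

```python
def binarize(embedding):
    """Map each float to one bit: 1 if the value is above zero, otherwise 0."""
    return [1 if x > 0 else 0 for x in embedding]


def pack_bits(bits):
    """Pack a list of 0/1 values into one Python int, first bit most significant."""
    packed = 0
    for b in bits:
        packed = (packed << 1) | b
    return packed


# Hypothetical toy embedding: small floats just above or below zero.
embedding = [0.03, -0.01, 0.002, -0.04, 0.01, 0.05, -0.02, -0.003]

bits = binarize(embedding)
packed = pack_bits(bits)

print(bits)                   # [1, 0, 1, 0, 1, 1, 0, 0]
print(format(packed, "08b"))  # 10101100
```

Eight floats (typically 32 bytes as float32) end up in a single byte here, which is where the storage savings come from.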
And instead of the usual cosine distance, you can compare two vectors with a much faster Hamming distance function.
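Hamming distance is just the number of bit positions where two bit vectors differ. A minimal sketch, assuming each binarized vector has already been packed into an integer (the name `hamming_distance` and the two example values are illustrative, not from the post):

```python
def hamming_distance(a, b):
    """Count the bit positions where two packed bit vectors differ."""
    # XOR leaves a 1 wherever the bits disagree; then count the 1s.
    return bin(a ^ b).count("1")


# Two hypothetical 8-bit binarized embeddings.
a = 0b10101100
b = 0b10011101

print(hamming_distance(a, b))  # 3
```

An XOR followed by a bit count is something CPUs do very cheaply, which is roughly why this is so much faster than multiplying and summing floats for cosine distance. A smaller distance means the two vectors are more alike.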
Once you understand embedding vectors and CLIP that should hopefully make sense.