


	by yagami_takayuki 895 days ago
	For recognizing features such as hair and skin color, which would do a better job? Machine learning with image classification? Or a vector database? I've had Weaviate return a dog given a human as input when doing an image similarity search, so I was wondering if there's some way to improve the results or whether I'm barking up the wrong tree.

4 comments

dimatura 895 days ago

For those two in particular? You'd definitely get a better result with an ML model such as a convolutional neural net. In some sense, using an image similarity query is a kind of ML model - nearest neighbor - which can work in some scenarios. But for this specifically, I'd recommend a CNN.
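To make the "image similarity query is a kind of nearest-neighbor model" point concrete, here is a minimal sketch of k-nearest-neighbor classification over embeddings. The 2-D vectors and labels are made-up toy data, and `knn_label` is a hypothetical helper, not part of any vector database's API; in practice the embeddings would come from an image model.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_label(query, labeled, k=3):
    """Majority vote over the k stored embeddings most similar to the query.

    labeled: list of (embedding, label) pairs.
    """
    ranked = sorted(labeled, key=lambda item: cosine(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy 2-D "embeddings" (assumption: real ones have hundreds of dimensions).
labeled = [
    ([1.0, 0.1], "dark hair"),
    ([0.9, 0.2], "dark hair"),
    ([0.1, 1.0], "blond hair"),
    ([0.2, 0.9], "blond hair"),
]
result = knn_label([0.8, 0.3], labeled, k=3)
print(result)
```

The quality of the answer depends entirely on whether the embedding space separates the attribute you care about, which is why a general-purpose similarity search can return a dog for a human.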


yagami_takayuki 895 days ago

Thanks, will experiment with a CNN.


bfeynman 895 days ago

I would think you could partially address that issue by improving your embedding space. Similarity search (over embeddings trained with some contrastive loss) definitely suffers at the tails, and out-of-distribution (OOD) behavior is pretty bad. That being said, you're more likely to get higher recall than with a more classical technique.


yagami_takayuki 895 days ago

Thanks for your answer!


Kubuxu 895 days ago

You could consider something like BLIP2. There are multiple ways you could use it: embed images and match them against embeddings of text descriptors, train custom embedding of descriptors, or a classifier on top of the embedding (linear layer on top of the image embedding network).

The approaches increase in complexity. It also allows for dataset bootstrap:

Let's say you want to classify cats by breed. You could start by embedding images and text descriptors and distance-matching the embedded descriptors to the images. This gives you a dataset that might be 90% correct; you can then clean it up, which would be easier to do than manually labelling it. Based on that improved dataset, you can train a custom embedding for the labels or a classification layer on top of the image embedding network.
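A rough sketch of the bootstrap step described above, assuming image and text embeddings already live in a shared space (how you get them from BLIP2 or a similar model is not shown here). The toy vectors, the `pseudo_label` helper, and the `min_margin` review threshold are all illustrative assumptions; flagging low-margin matches for manual review is just one possible way to do the cleanup pass.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pseudo_label(image_embs, text_embs, min_margin=0.1):
    """Assign each image the descriptor with the highest cosine similarity.

    text_embs: dict mapping label -> embedding of its text descriptor.
    Returns a list of (label, margin, needs_review) tuples, where margin is
    the gap between the best and second-best similarity.
    """
    results = []
    for img in image_embs:
        scored = sorted(
            ((cosine(img, emb), label) for label, emb in text_embs.items()),
            reverse=True,
        )
        best_score, best_label = scored[0]
        runner_up = scored[1][0] if len(scored) > 1 else 0.0
        margin = best_score - runner_up
        results.append((best_label, margin, margin < min_margin))
    return results


# Toy 2-D embeddings standing in for real model output (assumption).
text_embs = {"siamese": [1.0, 0.0], "maine coon": [0.0, 1.0]}
image_embs = [[0.9, 0.1], [0.1, 0.9], [0.7, 0.7]]

labels = pseudo_label(image_embs, text_embs)
for label, margin, needs_review in labels:
    print(label, round(margin, 3), "REVIEW" if needs_review else "ok")
```

The confidently matched images go straight into the dataset, and only the flagged ones need a human look. The cleaned labels can then be used to train the classification layer on top of the image embeddings.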


yagami_takayuki 895 days ago

Thank you, will look into BLIP2.


m00x 895 days ago

You don't use a vector database on its own; you need to feed it embeddings produced by an ML model.

For your use-case, it should be pretty simple. You could use a CNN and train it, or use YOLO, Deepface, or other face detection algos, then within the face, find the hair and find the skin.

From there you can use a vector database to get the colors that resemble other inputs, or you can use a simple CNN to classify the hair and skin to the closest label.
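As a sketch of the last step only (matching a detected region to the closest color label), assuming the hair or skin pixels have already been extracted by whatever detector you use. The palette values and the `closest_label` helper are made up for illustration, and plain Euclidean distance in RGB is a crude stand-in; a perceptual color space such as CIELAB would usually match human judgment better.

```python
def mean_rgb(pixels):
    """Average color of a list of (r, g, b) pixels."""
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))


def closest_label(rgb, palette):
    """Return the palette name with the smallest squared RGB distance."""
    return min(
        palette,
        key=lambda name: sum((a - b) ** 2 for a, b in zip(rgb, palette[name])),
    )


# Illustrative reference colors (assumption: tune these on your own data).
HAIR_PALETTE = {
    "black": (20, 20, 20),
    "brown": (90, 60, 40),
    "blond": (220, 190, 120),
    "red": (165, 70, 40),
}

# Pretend these pixels came from the hair region inside a detected face box.
hair_pixels = [(85, 58, 42), (95, 65, 38), (100, 62, 45)]
hair_label = closest_label(mean_rgb(hair_pixels), HAIR_PALETTE)
print(hair_label)
```

Averaging the whole region is sensitive to lighting and to background pixels leaking into the crop, which is one reason to prefer a trained classifier once you have labelled data.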


yagami_takayuki 895 days ago

Thanks! Will look into this.
