by stormfather 360 days ago
I would try the Qwen models before LLaVA.

Do you need the embeddings to be private? Or just the photos?


For photo indexing I'd run CLIP directly and save on compute; there's no need to use a whole language model.
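
To make the indexing side concrete, here is a rough sketch of what the lookup could look like once you have one embedding per photo. It assumes the vectors were already produced by CLIP's image encoder (and the query vector by its text or image encoder); the `PhotoIndex` class, the file names, and the tiny 3-dimensional vectors are all made up for illustration and stand in for real CLIP output.

```python
import math
from typing import Dict, List, Sequence, Tuple


def normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


class PhotoIndex:
    """Hypothetical in-memory index mapping photo path -> unit-length embedding."""

    def __init__(self) -> None:
        self._vectors: Dict[str, List[float]] = {}

    def add(self, path: str, embedding: Sequence[float]) -> None:
        # Store the normalized vector once so searches only need dot products.
        self._vectors[path] = normalize(embedding)

    def search(
        self, query_embedding: Sequence[float], k: int = 5
    ) -> List[Tuple[str, float]]:
        # Brute-force cosine similarity against every stored photo.
        q = normalize(query_embedding)
        scored = [
            (path, sum(a * b for a, b in zip(q, vec)))
            for path, vec in self._vectors.items()
        ]
        scored.sort(key=lambda item: item[1], reverse=True)
        return scored[:k]


# Placeholder vectors; real CLIP embeddings have hundreds of dimensions.
index = PhotoIndex()
index.add("beach.jpg", [0.9, 0.1, 0.0])
index.add("dog.jpg", [0.1, 0.9, 0.2])
index.add("city.jpg", [0.0, 0.2, 0.9])

# The query vector would come from CLIP too (text or image encoder).
results = index.search([0.8, 0.2, 0.1], k=2)
for path, score in results:
    print(f"{path}\t{score:.3f}")
```

Brute force like this is probably fine for a personal photo library; an approximate nearest-neighbor index only starts to matter at much larger scale.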