| HN Mirror



	by andsens 1387 days ago
	Uhm. You’re basically asking how the entire NN works. There is no easy explanation for that.


alok-g 1387 days ago

I understand neural networks, embeddings, convolutions, etc. The part that's unclear to me is specifically how the text embeddings are linked into the image-to-image network that is trying to reduce the noise. In other words, I'm missing how the process is 'conditioned upon' the text. (I lack the same understanding for conditional GANs as well.)

If the answer is just that the text embeddings are also fed to the network as ordinary inputs, then I already understand.
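For reference, in Stable Diffusion-style latent diffusion models the text embeddings are not concatenated onto the image input. They enter the denoising U-Net through cross-attention layers: image features supply the queries, text token embeddings supply the keys and values. Below is a minimal pure-Python sketch of one such layer. The function names, matrix sizes, and random weights are all illustrative assumptions, not the actual model code, which uses multi-head attention, normalization, and learned weights.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned weights or real embeddings."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def cross_attention(image_feats, text_embs, w_q, w_k, w_v):
    """Single-head cross-attention: image positions attend over text tokens."""
    q = matmul(image_feats, w_q)  # queries come from the (noisy) image features
    k = matmul(text_embs, w_k)    # keys come from the text embeddings
    v = matmul(text_embs, w_v)    # values come from the text embeddings
    scale = 1.0 / math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    # One attention distribution over text tokens per image position.
    weights = [softmax([s * scale for s in row]) for row in scores]
    attended = matmul(weights, v)
    # Residual connection: text information is added onto the image features.
    out = [[x + a for x, a in zip(xr, ar)] for xr, ar in zip(image_feats, attended)]
    return out, weights


rng = random.Random(0)
image_feats = random_matrix(4, 8, rng)  # 4 image positions, 8 channels (assumed sizes)
prompt_a = random_matrix(3, 6, rng)     # 3 text tokens, 6-dim embeddings (assumed sizes)
prompt_b = random_matrix(3, 6, rng)     # a different "prompt"
w_q = random_matrix(8, 5, rng)
w_k = random_matrix(6, 5, rng)
w_v = random_matrix(6, 8, rng)

out_a, weights = cross_attention(image_feats, prompt_a, w_q, w_k, w_v)
out_b, _ = cross_attention(image_feats, prompt_b, w_q, w_k, w_v)
```

With the same image features and weights, changing the text embeddings changes the layer's output, which is the sense in which each denoising step is "conditioned upon" the text.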


capableweb 1387 days ago

Might be worth looking through the dataset it was trained on, here's one example: https://laion-aesthetic.datasette.io/laion-aesthetic-6pls/im...

So the model understands (kinda) who Bob Moog is, so when you include "Bob Moog" in the prompt, the model knows what you are looking for.


ShamelessC 1385 days ago

Why did they unnecessarily re-index a smaller subset of LAION Aesthetic? You can search _all_ of LAION using the pre-built faiss indices from LAION.

https://rom1504.github.io/clip-retrieval/?back=https%3A%2F%2...

is a hosted version, but you can download and host it yourself as well.
