| HN Mirror



	by alok-g 1387 days ago
	I understand neural networks, embeddings, convolutions, etc. The part that's unclear to me is specifically how the textual embeddings are linked into the img-to-img network that is trying to reduce the noise. In other words, I'm missing how the process is 'conditioned upon' the text. (I lack the same understanding for conditional GANs as well.) If the answer is just that the textual embeddings are also fed as simple inputs to the network, then I already understand.


Might be worth looking through the dataset it was trained on. Here's one example: https://laion-aesthetic.datasette.io/laion-aesthetic-6pls/im...

So the model understands (kinda) who Bob Moog is, so when you include "Bob Moog" in the prompt, the model knows what you are looking for.

Why did they unnecessarily re-index a smaller subset of LAION Aesthetic? You can search _all_ of LAION using the pre-built faiss indices from LAION.

is a hosted version, but you can download and host it yourself as well.
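
For anyone wondering what "searching an index of embeddings" amounts to, here is a rough brute-force sketch in plain Python. It is not the faiss API and not LAION's actual pipeline: the captions, the 3-dimensional vectors, and the `search` helper are all made up for illustration. Real indices hold much higher-dimensional embeddings and use faiss to do this kind of lookup at scale.

```python
import math

def normalize(v):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def search(index, query, k=2):
    """Return the k entries whose vectors are most similar to the query (brute force)."""
    q = normalize(query)
    scored = []
    for name, vec in index.items():
        score = sum(a * b for a, b in zip(normalize(vec), q))
        scored.append((score, name))
    scored.sort(reverse=True)  # highest similarity first
    return [(name, score) for score, name in scored[:k]]

# Hypothetical toy "embeddings"; real ones come from a trained text/image encoder.
toy_index = {
    "synthesizer portrait": [0.9, 0.1, 0.0],
    "cat photo": [0.0, 1.0, 0.1],
    "modular synth rack": [0.8, 0.0, 0.3],
}

# A hypothetical query vector standing in for an embedded text prompt.
results = search(toy_index, [1.0, 0.0, 0.1])
for name, score in results:
    print(f"{name}: {score:.3f}")
```

The idea is just that the prompt and the dataset entries live in the same vector space, so "search" means ranking entries by how close their vectors are to the query vector.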