| HN Mirror


	by bjourne 1047 days ago
	Isn't this product kind of impossible? Like a compression program that compresses compressed files? If you have an algorithm for determining whether a generated image is good or bad couldn't the same logic be incorporated into the network so that it doesn't generate bad images?

3 comments

joefourier 1047 days ago

Not impossible at all - classifier networks are much, much easier to train than generative networks. However you can’t directly integrate the logic into the generator, you’d have to train the generator against the discriminator network. This is essentially the principle of a GAN and although many tricks have been developed in recent years, they tend to be finicky and difficult to train.
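To make "train the generator against the discriminator" concrete, here is a minimal sketch of the two usual GAN objectives for a single real/fake pair. The function names and the scalar probabilities are illustrative assumptions, not anything from the product under discussion; a real setup would backpropagate these losses through both networks.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy: push D(real) toward 1 and D(fake) toward 0."""
    return -math.log(d_real) - math.log(1.0 - d_fake)

def generator_loss(d_fake):
    """Non-saturating generator loss: push D(fake) toward 1."""
    return -math.log(d_fake)

# d_real / d_fake are the discriminator's probabilities that a sample is real
# (hypothetical values, chosen only for illustration).
print(discriminator_loss(d_real=0.9, d_fake=0.1))
print(generator_loss(d_fake=0.1))
```

The two losses pull `d_fake` in opposite directions, which is one way to see why this kind of training can be finicky.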

Diffusion models like SD are trained with a very simple loss function instead, which is just the L2 loss of an iterative denoising process. This tends to result in more stable training than with GANs. That said, you could fine-tune SD with reinforcement learning using the deformity detector as the reward, but it’s not a panacea, as it could lead to overfitting and performance degradation.
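As a rough illustration of how simple that loss is, the sketch below noises a clean sample and measures the squared error between the true noise and a model's guess. It assumes the common noise-prediction form of the objective; `predict_noise` is a hypothetical stand-in for the network, and the latent space and text conditioning that SD uses are left out.

```python
import math
import random

def add_noise(x0, alpha_bar, rng):
    """Forward process: blend the clean sample with Gaussian noise."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * e
          for x, e in zip(x0, eps)]
    return xt, eps

def l2_denoising_loss(predict_noise, x0, alpha_bar, rng):
    """Mean squared error between the true noise and the predicted noise."""
    xt, eps = add_noise(x0, alpha_bar, rng)
    eps_hat = predict_noise(xt, alpha_bar)  # hypothetical model call
    return sum((e - p) ** 2 for e, p in zip(eps, eps_hat)) / len(eps)

# A placeholder "model" that always predicts zero noise.
def zero_model(xt, alpha_bar):
    return [0.0] * len(xt)

print(l2_denoising_loss(zero_model, [0.5, -1.0, 2.0], 0.7, random.Random(0)))
```

There is no second network to balance against here, only a regression target, which is consistent with the stability point above.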

bjourne 1047 days ago

> Not impossible at all - classifier networks are much, much easier to train than generative networks. However you can’t directly integrate the logic into the generator, you’d have to train the generator against the discriminator network.

Generative networks are, IME, not at all difficult to train because the amount of training data is typically orders of magnitude larger. In this case, the idea is to train something to classify images as high or low quality, which I think is just as hard as generating images. Regardless, if you had such logic, I don't see why you couldn't incorporate that into the network's own loss function? That's how it is done for L1 and L2 regularization and many other techniques for "tempering" the training process.
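A minimal sketch of what "incorporate it into the loss" would look like, by analogy with L2 regularization. All names and weights here are assumptions for illustration; `quality_penalty` stands in for the output of a hypothetical learned quality scorer.

```python
def l2_penalty(weights):
    """Classic L2 regularizer: sum of squared weights."""
    return sum(w * w for w in weights)

def total_loss(task_loss, weights, quality_penalty,
               weight_decay=0.01, quality_weight=1.0):
    """Task loss plus weighted penalty terms, all minimized together."""
    return (task_loss
            + weight_decay * l2_penalty(weights)
            + quality_weight * quality_penalty)

# quality_penalty could be something like 1 - P(image is good) from the
# hypothetical scorer; it only helps training if gradients can flow through
# it back to the generator.
print(total_loss(task_loss=1.0, weights=[1.0, 2.0], quality_penalty=0.5))
```

The L2 term is a fixed formula over the weights, whereas a quality term depends on a second trained model, which is where the comparison with GAN training comes in.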

The problem is that you want the model to be creative but not "too creative" (e.g., eight-fingered hands). But preventing it from being too creative risks making it boring and bland (e.g., only generating stock images). I don't think you can solve that with a post-processing filter. Generating, say, 100 images and picking the "best" one might just be the same as picking the most bland one.
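The best-of-N filter described above can be sketched in a few lines. The "novelty" number per image and the scorer that penalizes anything far from typical are toy assumptions, chosen only to show how such a scorer would favor the blandest candidate.

```python
import random

def best_of_n(generate, score, n, rng):
    """Draw n candidates and keep the one the scorer rates highest."""
    candidates = [generate(rng) for _ in range(n)]
    return max(candidates, key=score)

# Hypothetical stand-ins: each "image" is reduced to a novelty value, and the
# scorer prefers values close to typical (0.0).
def generate(rng):
    return rng.gauss(0.0, 1.0)

def score(novelty):
    return -abs(novelty)

rng = random.Random(0)
print(best_of_n(generate, score, 1, rng))    # a single unfiltered draw
print(best_of_n(generate, score, 100, rng))  # the draw closest to 0.0 out of 100
```

Whether a real quality scorer behaves like this toy one is exactly the open question in the comment.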

__loam 1047 days ago

That's essentially how using a GAN works.

E: or how it's supposed to work.

thumbuddy 1046 days ago

Kind of phenomenological, but both parts of the GAN are the same model.

darren_hsu 1047 days ago

We’re optimistic about using our own algorithms and models to evaluate another model. In theoretical computer science, it is easier to verify a correct solution than to generate a correct solution (P vs NP problem).

bagels 1047 days ago

I don't think P vs NP has anything to do with it, but also, I don't think your maxim is always true anyway (though maybe it often is).

Problem: traveling salesman; solution: one particular path. I think verifying that the solution is optimal in this case is exactly the same problem as finding the solution.
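A small sketch of the distinction, on an assumed 4-city distance matrix: checking that a given tour is valid and fits within a length budget takes one pass over the tour, while confirming that it is optimal is done here by brute force over every other tour, since no shortcut is known in general.

```python
from itertools import permutations

def tour_length(dist, tour):
    """Total length of the closed tour, returning to the start city."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))

def verify_within_budget(dist, tour, budget):
    """Cheap check: visits every city once and stays within the budget."""
    return sorted(tour) == list(range(len(dist))) and tour_length(dist, tour) <= budget

def is_optimal_brute_force(dist, tour):
    """Expensive check: compare against every tour that starts at city 0."""
    best = min(tour_length(dist, (0,) + p)
               for p in permutations(range(1, len(dist))))
    return tour_length(dist, tour) == best

# Hypothetical symmetric distances between four cities.
dist = [[0, 1, 4, 3],
        [1, 0, 2, 5],
        [4, 2, 0, 1],
        [3, 5, 1, 0]]

print(verify_within_budget(dist, [0, 1, 2, 3], budget=10))
print(is_optimal_brute_force(dist, [0, 1, 2, 3]))
```

The budget check stays cheap as the number of cities grows; the brute-force optimality check grows factorially.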

vore 1047 days ago

Not to nitpick but this is NOT the right takeaway from P vs NP.

lancesells 1047 days ago

Do you, or will you, use human labor in any instance on evaluating images?

darren_hsu 1047 days ago

We currently don't, since it's not scalable.
