| HN Mirror



	by giancarlostoro 1 hour ago
	> For others, you had to resize it before input, which meant you were adding an image with poor resolution to start. That's because small models like SD (Stable Diffusion) are trained on very specific resolutions; it's the fancier models that are trained on higher-quality or more diverse sets of resolutions. If you use a higher-quality model to generate lower-resolution images, what's actually happening is that you're trimming a much bigger image and getting a chunk of it as output. At least, that's how it feels based on my many hours of experimenting. If I use major models and try to center a thing, I never see it in the center. :) My GPU can only handle so much.


vunderba 1 hour ago

So traditionally, the way you’d do this (and why some UIs like automatic1111 let you configure inpainting so flexibly) is that you didn’t have to shrink the entire image.

The general idea was: you mask the area you want changed, and the model inpaints that region at full resolution. The advantage of masking, compared to plain img2img, is that you’re not sending the entire picture to the model.

With the classic setups like SD 1.5 and SDXL, you’d effectively inpaint at full resolution: take the masked area from a larger image, scale just that region to the model’s native resolution (roughly 1 megapixel for SDXL; SD 1.5 is native at 512×512), process it there, then scale it back and composite it into the original. This lets you add MORE detail, because the model spends its whole pixel budget on the masked region rather than on the entire picture.
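To make the crop, scale, and composite flow concrete, here is a rough sketch of just the geometry, in plain Python with no image library. It is not how automatic1111 or any other UI actually implements this. The function names (`mask_bbox`, `pad_box`, `working_size`, `plan_inpaint`), the padding default, and the 1024×1024 pixel budget are all assumptions made for illustration. Rounding to a multiple of 8 reflects the 8× downsampling of the Stable Diffusion VAE.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def mask_bbox(mask: List[List[int]]) -> Box:
    """Smallest box containing every nonzero mask pixel."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        raise ValueError("mask has no selected pixels")
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    return cols[0], rows[0], cols[-1] + 1, rows[-1] + 1


def pad_box(box: Box, padding: int, width: int, height: int) -> Box:
    """Grow the box so the model sees some context, clamped to the image."""
    left, top, right, bottom = box
    return (max(0, left - padding), max(0, top - padding),
            min(width, right + padding), min(height, bottom + padding))


def working_size(box: Box, target_pixels: int = 1024 * 1024,
                 multiple: int = 8) -> Tuple[int, int]:
    """Size to scale the crop to: about target_pixels in area, same aspect
    ratio, each side rounded to a multiple of 8 (assumed model constraint)."""
    left, top, right, bottom = box
    w, h = right - left, bottom - top
    scale = (target_pixels / (w * h)) ** 0.5
    new_w = max(multiple, round(w * scale / multiple) * multiple)
    new_h = max(multiple, round(h * scale / multiple) * multiple)
    return new_w, new_h


def plan_inpaint(mask: List[List[int]], padding: int = 32,
                 target_pixels: int = 1024 * 1024) -> Dict[str, tuple]:
    """Describe the three steps: crop this box, run the model at work_size,
    then scale the result back to the crop size and paste it at paste_at."""
    height, width = len(mask), len(mask[0])
    crop = pad_box(mask_bbox(mask), padding, width, height)
    return {
        "crop": crop,
        "work_size": working_size(crop, target_pixels),
        "paste_at": (crop[0], crop[1]),
    }
```

The point of the sketch is that only the padded crop ever reaches the model, so a small masked region in a huge image still gets the full working resolution, and the rest of the image is never resampled.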

Unfortunately if the OP is using hosted SD models, they might not have that granular control and thus would suffer pretty bad quality loss.


giancarlostoro 1 hour ago

I realized I was speaking more generally, not strictly about inpainting, but yeah, that makes sense. That said, I've also had inpainting limited by the image being too big for my GPU to handle. I may be using it incorrectly, though; I haven't really experimented with it in a while. Maybe when I get a newer gaming rig.


vunderba 55 minutes ago

Yeah, the landscape also changes a lot. It’s just really hard to keep up with everything, especially if you’re using it casually, because some of the UI wrappers (the Gradio-based ones) have more obscure knobs and dials than a TI‑82 calculator.

This is the image I always think of when first introducing someone to ComfyUI or even Automatic1111.

https://imgur.com/a/G0Xlznj
