How we got Stable Diffusion XL inference to under 2 seconds

	How we got Stable Diffusion XL inference to under 2 seconds (baseten.co)
	51 points by varunshenoy 1022 days ago

4 comments

cwillu 1022 days ago

Playing around with the cfg cutoff technique, I'm finding that turning off guidance at the 40% mark causes requested fine details to not appear in the final image. This sorta implies that switching cfg midway and/or switching prompt vectors might be interesting from a prompting standpoint, but it kinda kills it as a performance optimization.

cwillu 1020 days ago

https://imgur.com/a/47D6MEl demonstrates this. The prompt is “Praying Mantis, looking out a living room window, (hyperrealism:1.2), (8K UHD:1.2), (photorealistic:1.2), shot with Canon EOS 5D Mark IV, detailed bug, macro” with 50 steps, comparing cutting off cfg at 20 steps against using cfg all the way through in the normal fashion.
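
For anyone wondering where the speedup in that comparison would come from, here is a rough sketch in plain Python. It assumes "cutting off cfg" means using only the prompt-conditioned prediction after the cutoff step, so the unconditional evaluation is skipped. The function names are made up for illustration and are not from any library or from the article.

```python
def guided_prediction(uncond, cond, scale):
    """Standard classifier-free guidance mix of two noise predictions.

    uncond and cond are equal-length lists of floats standing in for the
    denoiser outputs without and with the prompt.
    """
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


def count_denoiser_evals(num_steps, cutoff_step):
    """Count denoiser evaluations when cfg is dropped at cutoff_step.

    Assumption: a step with cfg needs two evaluations (conditional and
    unconditional), a step without cfg needs only the conditional one.
    """
    evals = 0
    for step in range(num_steps):
        evals += 2 if step < cutoff_step else 1
    return evals


# The setup from the comparison above: 50 steps, cfg cut off at step 20.
full = count_denoiser_evals(50, 50)   # cfg on every step
cut = count_denoiser_evals(50, 20)    # cfg on the first 20 steps only
saved = full - cut                    # evaluations skipped by the cutoff
```

Under those assumptions the cutoff run does 70 evaluations instead of 100. In practice the two predictions are usually batched together, so the wall-clock saving will not match the evaluation count exactly.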

Tenoke 1022 days ago

It's a bit weird to talk about steps but not about the sampler (20 steps with Euler vs 20 steps with DPM++ 2M Karras are pretty different beasts in terms of both speed and quality).

I also see compiling but no AITemplate, which seems to be among the hottest ways to speed up SD recently.

yieldcrv 1021 days ago

This could save a lot of money on Replicate.ai

Especially if you are charging your users the same 1,000% markup while your own costs have been cut to a third and you deliver results faster

gmerc 1022 days ago

I don’t know man, out of the box on SD-Next it’s about 3-4 secs for a picture at 1024 with UniPC and 20 steps on a 4090.
