Playing around with the cfg technique, I'm finding that turning off guidance at the 40% mark causes requested fine details to not appear in the final image. This sorta implies that switching cfg midway and/or switching prompt vectors might be interesting from a prompting standpoint, but it kinda kills it as a performance optimization.
https://imgur.com/a/47D6MEl demonstrates this. The prompt is “Praying Mantis, looking out a living room window, (hyperrealism:1.2), (8K UHD:1.2), (photorealistic:1.2), shot with Canon EOS 5D Mark IV, detailed bug, macro” with 50 steps, comparing cutting off cfg at step 20 vs. using cfg all the way through in the normal fashion.
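For anyone who wants to poke at the same thing, here's a rough sketch of what "cutting off cfg at step 20" could look like in a sampling loop. It's not the code behind the images above. The `predict` callable, the scale of 7.5, and the choice to fall back to the conditional prediction alone after the cutoff are all assumptions for illustration, and the actual sampler update is left out.

```python
def cfg_combine(eps_uncond, eps_cond, scale):
    """Standard classifier-free guidance mix: uncond + scale * (cond - uncond)."""
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


def run_with_cutoff(predict, total_steps, cutoff_step, scale):
    """Run the guidance part of a sampling loop, dropping CFG at cutoff_step.

    `predict(step, conditional)` is a hypothetical stand-in for the denoiser;
    it returns a noise prediction as a list of floats.
    Returns (final noise prediction, number of model calls made).
    """
    calls = 0
    eps = None
    for step in range(total_steps):
        eps_cond = predict(step, True)
        calls += 1
        if step < cutoff_step:
            # Guided step: needs a second (unconditional) model call.
            eps_uncond = predict(step, False)
            calls += 1
            eps = cfg_combine(eps_uncond, eps_cond, scale)
        else:
            # Assumption: "guidance off" means using the conditional
            # prediction alone, which is the same as scale == 1.
            eps = eps_cond
        # A real sampler would update the latents from eps here.
    return eps, calls


def toy_predict(step, conditional):
    """Toy denoiser so the sketch runs without a model."""
    return [1.0, 3.0] if conditional else [0.0, 1.0]


_, calls_cutoff = run_with_cutoff(toy_predict, total_steps=50, cutoff_step=20, scale=7.5)
_, calls_full = run_with_cutoff(toy_predict, total_steps=50, cutoff_step=50, scale=7.5)
print(calls_cutoff, calls_full)
```

With 50 steps and a cutoff at 20, this makes 70 model calls instead of 100, which is where the speedup would come from, at the cost of the missing fine details.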
It's a bit weird to talk about steps but not about the sampler (20 steps with Euler vs 20 steps with DPM++ 2M Karras are pretty different beasts, both in terms of speed and quality).
I also see compiling but no AITemplate, which seems to be among the hottest ways to speed up SD recently.