by taylorfinley 1097 days ago
Really impressive. The abstract claims sub-2-second generation times, but the YouTube demo seems to show generations taking ~6 seconds. Not that I would complain about 6-second generations; my 12 GB 3060 probably takes 3-4x as long running SD1.5. Perhaps they're not counting the time to load the model, just the active inference time?

Yeah, Stable Diffusion has a stage before generation where it transforms the text prompt into data for the model to use, called CLIP encoding. It runs before every generation, and it's probably the stage in the video where you see a spinner in place of the step bar.
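
If you want to check where the wall-clock time goes, here is a rough sketch of timing the prompt-encoding stage separately from the denoising steps. To be clear, `encode_prompt` and `denoise` are hypothetical stand-ins that just do toy arithmetic, not real Stable Diffusion calls, and the stage names are my own; the point is only the pattern of wrapping each stage in its own timer.

```python
import time
from contextlib import contextmanager


@contextmanager
def stage(timings, name):
    """Accumulate the elapsed wall-clock time of the wrapped block under `name`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name] = timings.get(name, 0.0) + (time.perf_counter() - start)


def encode_prompt(prompt):
    # Hypothetical stand-in for the text encoder: turns the prompt into numbers.
    return [float(ord(ch)) for ch in prompt]


def denoise(embedding, steps):
    # Hypothetical stand-in for the iterative sampler: one toy update per step.
    latent = 0.0
    mean = sum(embedding) / len(embedding)
    for _ in range(steps):
        latent = 0.5 * latent + mean
    return latent


def generate(prompt, steps=20):
    """Run both stages and return the result plus a per-stage timing dict."""
    timings = {}
    with stage(timings, "text_encode"):
        embedding = encode_prompt(prompt)
    with stage(timings, "denoise"):
        image = denoise(embedding, steps)
    return image, timings


if __name__ == "__main__":
    _, timings = generate("a photo of a cat", steps=20)
    for name, seconds in timings.items():
        print(f"{name}: {seconds:.6f} s")
```

With a real pipeline you would swap the two stand-ins for the actual encoder and sampler calls. If the abstract's number only covers the second stage, that kind of breakdown would show where the rest of the time in the video comes from.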