Hi, I've built the demo. Unfortunately, it's running on a single GPU, so it can only handle a few concurrent users. For a better real-time experience, you would need a dedicated machine.
FYI, this is made possible by a new technique, https://latent-consistency-models.github.io, which fine-tunes an existing model. The author will soon publish the training script. We'll see all the cool image models running at this speed! I'm excited about this! I can see many interesting experiments and projects emerging from it.