i worked on the gpu/infra side of this, so feel free to AMA
Ultimately the LCM is just an SD UNet trained with a new objective, so a lot of SD optimizations transfer to LCMs
there's some very cool consistent hashing and probability stuff going on in the GPU routing logic, but again, i'll write it up after this fire ends, similar to a post-mortem.
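for anyone who hasn't run into consistent hashing before, here's a generic textbook sketch of the idea. to be clear, this is not the actual routing logic (that write-up is still pending) and it doesn't cover the probability part at all. the `HashRing` class, the `gpu-N` node names, the `vnodes` count, and the choice of routing key are all made up for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Generic consistent-hash ring (illustrative only, not the real router)."""

    def __init__(self, nodes, vnodes=64):
        # Each node gets `vnodes` points on the ring to smooth out the load.
        ring = []
        for node in nodes:
            for i in range(vnodes):
                ring.append((self._hash(f"{node}#{i}"), node))
        ring.sort()
        self._ring = ring
        self._keys = [h for h, _ in ring]

    @staticmethod
    def _hash(key):
        # First 8 bytes of SHA-256 as a 64-bit integer position on the ring.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def route(self, key):
        # Walk clockwise to the first ring point after the key's hash,
        # wrapping around to the start of the ring if needed.
        idx = bisect.bisect_right(self._keys, self._hash(key)) % len(self._keys)
        return self._ring[idx][1]


if __name__ == "__main__":
    ring = HashRing(["gpu-0", "gpu-1", "gpu-2"])  # hypothetical node names
    print(ring.route("model-a"))  # hypothetical routing key
```

the property that makes this attractive for routing is that when a node drops out, only the keys that were mapped to it move; everything else keeps landing on the same node, so whatever those nodes already have warm stays useful.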