Hacker News
by bonoboTP 426 days ago
I don't believe video generation alone can sync nonverbal communication this well: the shrug, eye movement, facial expressions, etc. all line up perfectly with the voice. As I said, I think it's conditioned on some real footage, perhaps somewhat like ControlNet.