by 0xx 1853 days ago
Founder here. AMA :)

To answer a few recurring questions in the thread:

---> Use case.

Video is a far more effective way to communicate than text. Not for the HN crowd, but if you're a blue-collar worker, a 2-minute video in your native language is much preferred to a 5-page PDF for training.

Anyone who has tried to record a simple corporate video knows the pain of cameras, film crews, 25 takes to get one that works, and post-production. Cumbersome, slow and multidisciplinary. By the time the video is done, the content is out of date.

Synthetic video is not yet at the quality of real video. Eventually it will be. But the mistake many are making here is comparing it to real video; it should be compared with text.

In X years we'll be able to make Hollywood films on a laptop without needing anything but time and imagination. Just like we can digitally compose music in Ableton, create images in Photoshop and type novels on keyboards rather than with pen and paper.

My (obviously biased ;)) belief is that synthetic media will eventually become a foundational technology that will move media production from cameras/microphones to APIs. We'll be able to do all kinds of things we couldn't do before.

E.g. personalized and interactive rich media, video-driven chatbots and eventually Hollywood blockbusters made by your favourite YouTuber from his or her bedroom.

---> Uncanny valley

Simulating real video is incredibly hard. We're constantly improving and launching more expressive synthesis soon.

In our tests with some of our largest clients, 8/10 people don't realise it's a synthetic video (unless they are asked to look for it).

---> Tech

Has been developed over the last 3 yrs. Origins/team from Stanford/UCL/TUM.

Learning: Going from research to a working, scalable product is hard and takes time. But very rewarding when it works.

[1] https://www.youtube.com/watch?v=ohmajJTcpNk
[2] https://www.youtube.com/watch?v=qc5P2bvfl44

---> Bad uses

Bad actors will do bad things with synthetic media, as with any other technology from smartphones to cars. We're moderating all content and building safeguards and verification + working with FAANG and others on detection and provenance technology.

Recommended read - deepfakes perfectly follow the story arc of any new, powerful technology: https://journals.sagepub.com/doi/full/10.1177/17456916209193...

---> Actors

Real actors get a rev share + upfront fee for every video generated with their likeness. Like being a stock photo actor.

1 comments

The Snoop Dogg advertisement rebranding case study was pretty impressive to me, since there were obvious savings from reuse. Neat to see how this technology could be integrated in a subtle way with other editing techniques.

It seems to me that this technology could have immediate application to dubbing over curse words in movies (since that's already done in a not so subtle way today).

The next step I see in that progression is full dubbing for translation, which already exists in a very conspicuous form. The old meme about out-of-sync karate movie dubs comes to mind.

How close do you think this technology is to being usable for syncing lips in Hollywood-tier movie dubs with real voice actors? What are the main obstacles left to achieving that?