|
|
|
|
|
by makeitdouble
206 days ago
|
|
I've been in many situations where I wanted translations, and I can't think of one where I'd actually want to rely on either glasses or the airpods working like they do in the demos. The crux of it for me: - if it's not a person it will be out of sync, you'll be stopping it every 10 sec to get the translation. One could as well use their phone, it would be the same, and there's a strong chance the media is already playing from there so having the translation embedded would be an option. - with a person, the other person needs to understand when your translation in going on, and when it's over, so they know when to get an answer or know they can go on. Having a phone in plain sight is actually great for that. - the other person has no way to check if your translation is completely out of whack. Most of the time they have some vague understanding, even if they can't really speak. Having the translation in the glasses removes any possible control. There are a ton of smaller points, but all in all the barrier for a translation device to become magic and just work plugged in your ear or glasses is so high I don't expect anything beating a smartphone within my lifetime. |
|
Aircaps demos show it to be pretty fast and almost real time. Meta's live captioning works really fast and is supposed to be able to pick out who is talking in a noisy environment by having you look at the person.
I think most of your issues are just a matter of the models improving themselves and running faster. I've found translations tend to not be out of whack, but this is something that can't really be solved except by having better translation models. In the case of Airpods live translate the app will show both people's text.