| I just tested the model [1] on an RTX 3090, trying to translate a French text I found here [2].

Some observations:

- The full translation of the 6:22 minute video takes about 22 seconds (roughly 17x real time).
- It detects the language by default (and correctly recognized that the audio was French).
- MIT License [3]!
- The quality of the transcription is good, but not perfect.
- The quality of the translation (if you don't count transcription errors as translation errors) is generally very good.

---

The transcription:

> Bonjour à tous, <error>j'suis</error> espère que vous allez bien, c'est ENTI. Et aujourd', <error>aujourd',</error> on se retrouve <error>un peu physique</error> pour parler de la termo dynamique. Vous ne vous inquiétez pas, ça va bien se passer. On va y aller ensemble, <error>être à par exemple,</error> je vous accompagne à travers une série de vidéos pour vous expliquer les principes de base en termo dynamique. Et bah, c'est parti, on va y aller tranquillement. Lidée, c'est vous puissiez comprendre la termo dynamique dans son ensemble. Donc, je vais vraiment prendre mon temps pour <error>couplisser</error> bien comprendre les notions,

The translation:

> Hello everyone, I hope you're doing well, it's NT and today we find ourselves a little physical to talk about the thermo dynamic. Don't worry, it's going well, we're going to go together and be the same. I'm going to accompany you through a series of videos to explain the basic principles in thermo dynamic. Well, let's go, <error>we're going to go quietly</error>. The idea is that you can understand the thermo dynamic <error>in sound together</error>. So I'm really going to take my time to understand the notions,

---

All in all, I'm very happy that OpenAI is publishing their models. If Stable Diffusion is any guide, people will hack some crazy things with this.

[1] https://github.com/openai/whisper
[2] https://www.youtube.com/watch?v=OFLt-KL0K7Y
[3] https://github.com/openai/whisper/blob/main/LICENSE |