Depending on your hardware, the model can run in real time, meaning it transcribes audio in less time than the audio's own duration.
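One way to check this on your own machine is to measure the real-time factor (RTF): processing time divided by audio duration, where a value below 1.0 means transcription finishes faster than the audio plays. The sketch below is a minimal illustration, not part of the model's API. The names `real_time_factor` and `is_real_time` are hypothetical helpers, and the `transcribe` argument stands in for whatever call actually runs the model on your clip.

```python
import time
from typing import Callable


def real_time_factor(transcribe: Callable[[], object], audio_seconds: float) -> float:
    """Time one call to `transcribe` and return elapsed time / audio duration.

    `transcribe` is a placeholder: a zero-argument callable that runs the
    model on a clip whose length is `audio_seconds`.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    start = time.perf_counter()
    transcribe()
    elapsed = time.perf_counter() - start
    return elapsed / audio_seconds


def is_real_time(rtf: float) -> bool:
    """An RTF below 1.0 means processing was faster than the audio's length."""
    return rtf < 1.0


if __name__ == "__main__":
    # Stand-in workload: pretend transcribing a 10-second clip takes ~50 ms.
    rtf = real_time_factor(lambda: time.sleep(0.05), audio_seconds=10.0)
    print(f"RTF = {rtf:.3f}, real time: {is_real_time(rtf)}")
```

A single run can be noisy, so averaging over several clips of different lengths gives a more reliable picture of whether your hardware keeps up.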