|
"It turns out that the round-trip time from an audio interface, through a computer (DAW) and back to the speakers takes a few hundreds of milliseconds, making direct audio processing impossible using consumer hardware." - uh, what? Real-time audio processing has been a thing for at least a couple of decades. It doesn't work by default on Windows, but you can get free drivers (ASIO4ALL) which make it work on pretty much any hardware. And it works out of the box on Macs.
"Latency seems to shift by a few tens of milliseconds when restarting the application." - this makes me think you are using the wrong API for your sound input/output. With modern realtime audio support, your total latency from input to output should be under 10 ms.
"I expected that memory usage would get out of hand quite fast due to the ever growing dictionary of arrays containing audio data, but this does not happen in practice. I suspect that the good performance is caused by highly optimized memory management of Python and modern OSes." - without concrete figures it's quite hard to evaluate this, but what did you expect to happen? With a 44.1 kHz stereo audio stream, you should be storing 88.2 thousand samples a second. Say you're using 64-bit floats (8 bytes per sample), as a worst case. Your audio storage should be growing at about 689 KB/sec, plus a bit extra for object overhead. How much is it actually growing by? Of course Python is probably doing a bunch of allocation and deallocation for temporary objects behind the scenes, but hopefully you should not need to lean too hard on 'highly optimized memory management' - ideally, you should hardly be allocating anything at all. Also, why a dict, rather than just a large array that you can occasionally make bigger?
Finally ... I'm sure you already know that Python is possibly the worst mainstream language you could pick for realtime audio processing. But that is fine. I have tried to build audio stuff in Python too!
Sometimes using the wrong tool for the job is part of the fun. |
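For what it's worth, here is a rough sketch of the storage arithmetic from my comment, plus the "one big array that occasionally gets bigger" idea. The names (`bytes_per_second`, `GrowableBuffer`, `append_block`) are made up for illustration and have nothing to do with your actual code. I'm using the standard-library `array` module only to keep it dependency-free; in practice you would more likely reach for a NumPy array.

```python
from array import array

SAMPLE_RATE = 44_100   # Hz
CHANNELS = 2           # stereo
BYTES_PER_SAMPLE = 8   # 64-bit float, the worst case assumed above


def bytes_per_second(rate=SAMPLE_RATE, channels=CHANNELS, width=BYTES_PER_SAMPLE):
    """Raw storage growth for an uncompressed stream, ignoring object overhead."""
    return rate * channels * width


class GrowableBuffer:
    """Hypothetical helper: one flat sample buffer that doubles when it fills up."""

    def __init__(self, capacity=SAMPLE_RATE * CHANNELS):
        # Allocate the whole capacity up front, zero-filled.
        self._data = array("d", [0.0]) * capacity
        self._used = 0

    @property
    def capacity(self):
        return len(self._data)

    def append_block(self, block):
        """Copy a block of samples in, growing the buffer only when needed."""
        block = array("d", block)
        needed = self._used + len(block)
        if needed > len(self._data):
            # Grow to at least double, so reallocations stay rare.
            extra = max(needed, 2 * len(self._data)) - len(self._data)
            self._data.extend(array("d", [0.0]) * extra)
        self._data[self._used:needed] = block
        self._used = needed

    def samples(self):
        """Return a copy of the samples written so far."""
        return self._data[:self._used]


rate = bytes_per_second()
print(rate, "bytes/s, about", round(rate / 1024), "KiB/s")  # 705600 bytes/s, about 689 KiB/s

buf = GrowableBuffer(capacity=4)
buf.append_block([1.0, 2.0, 3.0])
buf.append_block([4.0, 5.0, 6.0])   # does not fit in 4, so capacity doubles to 8
```

With doubling, an hour of audio starting from a one-second buffer needs only about a dozen reallocations, which is the sense in which you "should hardly be allocating anything at all".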