| HN Mirror


by camtarn 653 days ago

"It turns out that the round-trip time from an audio interface, through a computer (DAW) and back to the speakers takes a few hundreds of milliseconds, making direct audio processing impossible using consumer hardware." - uh, what? Real-time audio processing has been a thing for at least a couple of decades. It doesn't work by default on Windows, but you can get free drivers (ASIO4All) which make it work on pretty much any hardware. And it works out of the box on Macs.

"Latency seems to shift by a few tens of milliseconds when restarting the application." - this makes me think you are using the wrong API for your sound input/output. With modern realtime audio support, your total latency from input to output should be less than 10ms total.

"I expected that memory usage would get out of hand quite fast due to the ever growing dictionary of arrays containing audio data, but this does not happen in practice. I suspect that the good performance is caused by highly optimized memory management of Python and modern OSes." - without concrete figures it's quite hard to evaluate this, but what did you expect to happen? With a 44.1KHz stereo audio stream, you should be storing 88.2 thousand samples a second. Say you're using 64-bit floats, as a worst case. Your audio storage should be growing at about 689KB/sec, plus a bit extra for object overhead. How much is it actually growing by? Of course Python is probably doing a bunch of allocation and deallocation for temporary objects behind the scenes, but hopefully you should not need to lean too hard on 'highly optimized memory management' - ideally, you should hardly be allocating anything at all. Also, why a dict, rather than just a large array that you can occasionally make bigger?

Finally ... I'm sure you already know that Python is possibly the worst mainstream language you could pick for realtime audio processing. But that is fine. I have tried to build audio stuff in Python too! Sometimes using the wrong tool for the job is part of the fun.
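The storage estimate in the comment above can be reproduced with a few lines of arithmetic. This is only a sketch under the comment's stated assumptions (44.1 kHz, stereo, 64-bit float samples); the function name `bytes_per_second` is made up for illustration and does not come from the project being discussed.

```python
def bytes_per_second(sample_rate_hz: int, channels: int, bytes_per_sample: int) -> int:
    """Raw growth rate of uncompressed audio, ignoring per-object overhead."""
    return sample_rate_hz * channels * bytes_per_sample


# 44.1 kHz stereo stored as 64-bit floats (8 bytes each), the worst case named above.
rate = bytes_per_second(44_100, 2, 8)
kib_per_second = rate / 1024
mib_per_minute = rate * 60 / 1024**2

print(f"{rate} bytes/s, about {kib_per_second:.0f} KiB/s or {mib_per_minute:.1f} MiB/min")
```

At roughly 40 MiB per minute of raw samples, the growth is steady but modest on a machine with several gigabytes of RAM.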

4 comments

PhunkyPhil 653 days ago

+1. In Ableton on Windows you can get your latency down to ~40ms without a dedicated sound card using ASIO. Mac's drivers are even better with sub ~20 ms on my m2 pro IIRC.

shannonclaude 653 days ago

+1 to the comments here. Part of the issue here is running these applications in Python. It's not really optimized to handle these loads and do DSP-based compute efficiently.

jancsika 653 days ago

> Mac's drivers are even better with sub ~20 ms on my m2 pro IIRC.

Just to be clear that you're measuring apples to apples with OP:

You are measuring less than 40ms roundtrip latency on your Mac. Is this correct?

emptiestplace 650 days ago

You seem surprised, but any sort of live production requires this. Check out SonoBus, it achieves adequately low end-to-end latency even with network delays in the mix.

CyberDildonics 652 days ago

When you see someone using Python for something as real-time and latency-sensitive as audio, don't you expect more wacky red flags on top of the fact that Python is going to be 50x to 100x slower than a native program?

Crazy numbers on top of a dictionary of arrays? It's all there.

spmvg 652 days ago

Half the fun of this project has been doing it in Python and performance hasn't been an issue so far, which says something about how fast Python is already. And indeed native would be ~50x-100x faster.

I must defend the design choice for the dictionary of arrays though, this has been a very conscious choice:

  - The "dictionary-of-arrays" approach allows lookups in constant time O(1), irrespective of how much data has already been stored (compared to one big array)
  - The dictionary structure allows me to throw away data in the middle easily (without having to handle growing arrays), because the "dictionary-of-arrays" has already been chunked. The audio looper will use only some parts of the recorded audio, leaving big parts in between unused.
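A rough sketch of what such a chunked store could look like, to make the two points concrete. This is not the author's actual implementation: the class `ChunkStore`, its method names, and the tiny `CHUNK_SIZE` are all hypothetical, and it uses the standard-library `array` module rather than whatever array type the project uses. Note that indexing into a single contiguous array is also constant time, so in this sketch the practical gain is the cheap removal of chunks from the middle.

```python
from array import array
from typing import Dict, List, Optional

CHUNK_SIZE = 4  # samples per chunk; unrealistically small, for illustration only


class ChunkStore:
    """Audio stored as {chunk_index: array of samples} (hypothetical design)."""

    def __init__(self, chunk_size: int = CHUNK_SIZE) -> None:
        self.chunk_size = chunk_size
        self.chunks: Dict[int, array] = {}

    def put_chunk(self, index: int, samples: List[float]) -> None:
        """Store one fixed-size chunk of samples under its chunk index."""
        self.chunks[index] = array("d", samples)

    def sample_at(self, sample_index: int) -> Optional[float]:
        """Constant-time lookup: one dict access plus one array index."""
        chunk_index, offset = divmod(sample_index, self.chunk_size)
        chunk = self.chunks.get(chunk_index)
        return None if chunk is None else chunk[offset]

    def discard(self, first: int, last: int) -> None:
        """Drop chunks first..last (inclusive); other chunks are untouched."""
        for index in range(first, last + 1):
            self.chunks.pop(index, None)


store = ChunkStore()
for i in range(5):
    # Fill each chunk with its own global sample indices so lookups are easy to check.
    store.put_chunk(i, [float(i * CHUNK_SIZE + k) for k in range(CHUNK_SIZE)])

store.discard(1, 3)  # throw away the unused middle of the recording
print(sorted(store.chunks))  # remaining chunk indices
```

After the discard, samples in chunks 0 and 4 are still reachable by their original sample index, while lookups into the removed range return `None`.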

camtarn 652 days ago

Not necessarily. "I convinced this language/system to do something it really wasn't designed to do by optimising everything" is a well established genre of article :)

spmvg 652 days ago

:)

pjmlp 652 days ago

It is really a hammer for a screw.

spmvg 653 days ago

Interesting comment! I'm going to figure out if using another driver allows me to get under 20 ms in latency. Right now I'm measuring around 300 ms in latency round-trip, which is not a problem because I can correct for it. (I'm using a Focusrite Scarlett 2i2 with default drivers.)

The reasoning behind my comment about round-trip time was as follows:

  - Right now I'm measuring around 300 ms round-trip time, without processing in between
  - In the past I've tried to do live effects in Ableton with ASIO drivers (guitar in -> Ableton effects -> out), and the delay was too noticeable. I couldn't play that way without making my ears bleed and I've switched back to pedals since.

One follow-up: how could I achieve a round-trip latency of around 10 ms in total, as you describe? If I use a buffer of 500 samples @ 44.1 kHz, then I am already spending 11 ms just filling the buffer. So then the buffers need to become really small, causing more processing overhead, right? Not sure if this is the way to go.
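The trade-off in this question can be put into numbers. A minimal sketch, assuming a fixed 44.1 kHz sample rate; the helper names and the list of buffer sizes other than 500 are chosen here for illustration. It only computes how long one buffer takes to fill and how often the processing code would have to run. A real round trip includes at least an input and an output buffer plus driver overhead, so it will be longer than a single buffer duration.

```python
SAMPLE_RATE_HZ = 44_100


def buffer_duration_ms(buffer_samples: int, sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Time needed to fill one buffer, in milliseconds."""
    return 1000.0 * buffer_samples / sample_rate_hz


def callbacks_per_second(buffer_samples: int, sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """How many times per second a buffer must be processed."""
    return sample_rate_hz / buffer_samples


for buffer_samples in (500, 256, 128, 64):
    print(
        f"{buffer_samples:>4} samples: "
        f"{buffer_duration_ms(buffer_samples):5.2f} ms per buffer, "
        f"{callbacks_per_second(buffer_samples):6.1f} callbacks/s"
    )
```

Smaller buffers cut the fill time proportionally, at the cost of proportionally more callbacks per second, which is where per-call overhead in an interpreted language starts to matter.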

camtarn 653 days ago

Yeah, your Scarlett should be capable of single-digit ms latency. If you're on Windows, you need to install its ASIO drivers and figure out how to use them from Python. Then, yes, use tiny buffers and run your audio processing very fast - which is where Python's slowness will probably become a real problem.

10ms latency is how long sound takes to travel 3-and-a-bit metres. So if your amp is a few metres from you, you would experience that delay between hitting the guitar strings and hearing the amplified sound. This should barely be noticeable. If you were noticing a delay greater than that in your Ableton effects setup, your settings needed tweaking. All of this is completely possible - I had a PC-based electronic drum setup in 2006, running through the Reason DAW, which had 8ms latency between hitting a pad and hearing the result.
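The distance comparison is easy to check. A small sketch, assuming a speed of sound of about 343 m/s (dry air at roughly 20 °C); the function name is hypothetical and the 300 ms case is the round-trip figure quoted earlier in the thread.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: dry air at about 20 degrees Celsius


def equivalent_distance_m(latency_ms: float) -> float:
    """Distance sound travels through air in the given time."""
    return SPEED_OF_SOUND_M_PER_S * latency_ms / 1000.0


for latency_ms in (8, 10, 300):
    print(f"{latency_ms:>3} ms is like standing {equivalent_distance_m(latency_ms):.1f} m from the amp")
```

By the same measure, a 300 ms round trip corresponds to an amp roughly a hundred metres away.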

Hmm, I wonder if Cython (static Python-to-C compiler) would make writing audio code easier/more possible?

spmvg 653 days ago

With Ableton and the default ASIO configuration on my Scarlett I get 96 ms combined input+output latency without any processing in between, so that's probably what made my ears bleed before. Tweaking the sample rate and buffer size does indeed get me single-digit latencies in Ableton. So I'm definitely going to adjust the section about latency, thanks for this!

I'm a bit on the fence about what this means for the difficult latency calibration routine in the application. Ideally I could throw the calibration routine away, but then I require that users have ASIO installed, while the app now also works with non-ASIO drivers. And indeed Python itself might become a bottleneck (making this work in Python has been half the fun).

dist-epoch 653 days ago

Even without ASIO you should be able to hit 40 ms latency on pretty much any Windows audio hardware, including motherboard built-in.

If you get 300 ms you're doing something wrong. Note that Windows has multiple audio APIs, 300 ms is about the latency of the old MME api, you need to use the newer one, WASAPI.

spmvg 653 days ago

I apparently only have the old Windows MME drivers indeed (and ASIO, on Win10). Need to look into why I can't find WASAPI and if I can assume other Windows users have those by default.

michaelrmmiller 653 days ago

WASAPI has been available since Windows Vista. It isn’t its own set of drivers but rather a unifying layer for the WDM driver and the preceding mishmash of Windows audio APIs (MME, DirectSound, etc). WASAPI supports low-ish latencies with Exclusive Mode and then something like 10ms buffering in Shared Mode through the Windows audio server, I recall.

Put another way: any Windows audio device supports WASAPI unless it only ships with an ASIO driver which is unlikely, even in the pro audio space.

sim7c00 653 days ago

Try a Clarett interface. It also comes with preamps which will make your sound less noisy; the Scarlett preamps are just absolutely terrible. You can debug your DAW to see how it uses drivers and make a Python module which exposes similar functions to Python. You will likely still want delay compensation to make things seem free of any latency, but it will be doing _much_ less compensating. Maybe there's an open-source DAW if you want to skip reversing driver calls from a debugger.

spmvg 653 days ago

Debugging an existing DAW to see how they do it under the hood is an interesting idea. Haven't done that yet.

About another interface: I do want to keep the application supporting cheaper interfaces such as the Scarlett, because the target audience (hobby musicians) will be using those. Still would be a nice upgrade for me!

kibibu 653 days ago

Can take a peek at how Tracktion engine does it too

ubercore 653 days ago

I don't know windows audio, but on mac audio that's wildly high latency for a scarlett interface.

bongodongobob 653 days ago

I would disable any services and programs running in the background as well. Years ago I disabled the Windows print spooler and it greatly improved jitter. Not sure if that's still the case these days though, that was probably 10 years ago.

spmvg 653 days ago

So far CPU usage hasn't been an issue at all (<1% usually on my not-very-impressive laptop), which surprised me as well

pjmlp 652 days ago

The damage that Python not having a JIT has done.

At least BASIC was designed for native code compilation from day one, and after the 8 bit home computers generation passed by, getting compilers for 16 bit home computers was rather easy.

30 years later, people insist on using bytecode-interpreted languages for the wrong use cases.

abdulhaq 652 days ago

What about psyco? Anyway it's a very odd take that the early development of python should have been concerned with a jit. There were many far more pressing issues at the time.

pjmlp 652 days ago

Python has existed for 33 years....

abdulhaq 652 days ago

Yes and I've been using it for a lot of that time. Maybe you too. At the time tools like psyco were useful but never got enough traction to persuade core developers that it was a compelling direction. It never felt like an obviously wrong decision.

pjmlp 652 days ago

I only use it as Perl alternative for UNIX scripts, nothing else, unless forced otherwise.

There are enough alternatives with the same dynamism and native code generation.