by oritron 12 days ago
Low latency in Bluetooth audio comes down to codecs, and the best ones are proprietary.

If you want to really cut down latency and need wireless with hardware like this, you could use a second ESP32 and send your own bitstream between them.


I've been experimenting with more-or-less this on the existing ESP32-S3 (well, to a smartphone/PC rather than a 2nd ESP32).

Practical bandwidth limits are in the ~72kb/s range with Bluetooth and a custom wire protocol, and Opus voice-mode encoding can't run in realtime beyond complexity 3; music encoding can't run at all. Maybe there's a more compute-friendly audio codec I'm not aware of, but as far as I know these chips just aren't quite powerful enough for high-quality music encoding, unfortunately. I'm hoping the S31 might be a bit better fit here (decent CPU boost + better SIMD).
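To put the ~72kb/s figure in perspective, here is a rough sketch of how much compression different raw PCM formats would need to fit under it. It assumes "kb/s" means kilobits per second and that the budget is available entirely to audio payload; the format list and function names are made up for illustration.

```python
# Assumption: the ~72 kb/s figure quoted above is kilobits per second
# and is fully available for audio payload.
LINK_BUDGET_KBPS = 72

def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample, channels):
    """Raw (uncompressed) PCM bitrate in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000

def required_compression_ratio(bitrate_kbps, budget_kbps=LINK_BUDGET_KBPS):
    """How many times smaller the stream must get to fit the budget."""
    return bitrate_kbps / budget_kbps

# Illustrative formats, not taken from the thread.
FORMATS = {
    "16 kHz / 16-bit mono": (16_000, 16, 1),
    "44.1 kHz / 16-bit stereo": (44_100, 16, 2),
    "48 kHz / 16-bit stereo": (48_000, 16, 2),
}

for name, fmt in FORMATS.items():
    raw = pcm_bitrate_kbps(*fmt)
    ratio = required_compression_ratio(raw)
    print(f"{name}: {raw:.1f} kb/s raw, needs ~{ratio:.1f}x compression")
```

Even 16 kHz mono speech is about 3.6x over budget uncompressed, which is why a codec like Opus is needed on this link at all.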

Latency is still a bit rough with BT overhead. There might be some new options with LE Audio on the S31, but I haven't found a way to get below ~80ms with the existing ESP32-S3 stack.

tl;dr, high quality voice is doable today with okay latency, music probably less so, maybe the S31 will be better

I haven't benchmarked Bluetooth on these devices, but have you looked at your uncompressed audio rates over WiFi? OP was asking about Bluetooth at high quality and low latency, which I don't think is a possible combination, hence my suggestion of another ESP32 if wireless is necessary. If it isn't, a wire is difficult to beat.

ESP-NOW is another option to look at. It won't work to transmit to a phone directly, of course, but it can do point-to-point or multicast transmission between ESP32 devices. I've used it in some projects but not for audio, so I couldn't tell you how much of a buffer would be needed to make that work smoothly.

Another option for OP: if the audio is being synthesized, the synthesis parameters could be transmitted instead of the audio samples, with the synthesis done on the receiving device.
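As a rough illustration of why sending parameters is so much cheaper than sending samples, here is a minimal sketch. The wire format (waveform id, MIDI-style note number, velocity, duration) is entirely hypothetical and not something OP described; a real synth would have its own parameter set.

```python
import struct

# Hypothetical wire format, little-endian:
#   waveform id (u8), MIDI-style note (u8), velocity (u8),
#   one pad byte, duration in milliseconds (u16)
NOTE_EVENT = struct.Struct("<BBBxH")

def encode_note(waveform, note, velocity, duration_ms):
    """Pack one note event into a fixed-size payload."""
    return NOTE_EVENT.pack(waveform, note, velocity, duration_ms)

def decode_note(payload):
    """Unpack a payload back into (waveform, note, velocity, duration_ms)."""
    return NOTE_EVENT.unpack(payload)

def pcm_bytes(duration_ms, sample_rate_hz=44_100, bytes_per_sample=2):
    """Bytes of mono PCM needed to carry the same duration as samples."""
    return sample_rate_hz * bytes_per_sample * duration_ms // 1000

event = encode_note(waveform=1, note=69, velocity=100, duration_ms=500)
print(len(event), "bytes as parameters vs", pcm_bytes(500), "bytes as PCM")
```

The receiver would still need to run the same synthesis code, so this only helps when both ends are under your control.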

I have a (WIP) project that transfers audio over ESP-NOW. I haven't touched it in forever, but I remember it did work decently. I had to bring the audio sample rate down to 16kHz though, and it was just sending uncompressed audio. I probably could have dug more into configuring the radios for better throughput, or adding some basic compression to relax the bandwidth requirements.

Code is here:

https://github.com/bschwind/walkie-talkie
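For a sense of what 16kHz uncompressed audio means in ESP-NOW terms, here is a back-of-the-envelope sketch. It assumes 16-bit mono samples (the comment doesn't say) and the classic 250-byte ESP-NOW payload limit, and it ignores retransmissions and any per-packet header you might add. It is not taken from the linked repository.

```python
import math

# Assumption: classic ESP-NOW payload limit of 250 bytes per packet.
ESPNOW_MAX_PAYLOAD = 250

def packet_plan(sample_rate_hz, bytes_per_sample=2, channels=1,
                payload=ESPNOW_MAX_PAYLOAD):
    """Return (frames per packet, packets per second, ms of audio per packet)."""
    frame_bytes = bytes_per_sample * channels
    frames_per_packet = payload // frame_bytes
    packets_per_second = math.ceil(sample_rate_hz / frames_per_packet)
    ms_per_packet = 1000 * frames_per_packet / sample_rate_hz
    return frames_per_packet, packets_per_second, ms_per_packet

frames, pps, ms = packet_plan(16_000)
print(f"{frames} samples/packet, {pps} packets/s, {ms} ms of audio each")
```

Under these assumptions each packet carries under 8 ms of audio, so a receive buffer has to absorb any gap longer than that between packets.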

Fair point, I’ve been evaluating WiFi for my project as well and the ESP32-S3 certainly has better link rates and latency, though I’ve not determined if it can truly run uncompressed. UX-wise my project is pairing to a smartphone so I’m not eager to require that users bring a WiFi network with them (something neither Android nor iOS handles particularly gracefully), but regardless it would be good to have the option.

I was not aware that _any_ Espressif hardware even supported classic Bluetooth other than the very first ESP32 (and I'm not sure that one is even still available). And I was getting around 50ms latency back then (with the original ESP32 and SBC!)

What exactly did you try?

I’m using BLE GATT messaging with an upgrade path to L2CAP CoC channels for clients that support it. Roughly the path is: audio input -> opus encode -> BLE transmit -> smartphone/desktop. The latency floor ends up being ~80ms due to jitter buffer sizing, etc.
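To show how a floor like this can add up along the capture -> encode -> transmit -> jitter buffer path, here is a minimal latency-budget sketch. The encode time, transport time, and jitter-buffer depth are placeholder values chosen for illustration, not measurements from this setup; only the list of frame durations comes from Opus itself.

```python
# Frame durations Opus supports, in milliseconds.
OPUS_FRAME_SIZES_MS = (2.5, 5, 10, 20, 40, 60)

def latency_floor_ms(frame_ms, encode_ms, transport_ms, jitter_frames):
    """Sum the stages of a capture -> encode -> transmit -> jitter-buffer path.

    encode_ms, transport_ms and jitter_frames are inputs you would have to
    measure or choose for your own system; nothing here is device-specific.
    """
    if frame_ms not in OPUS_FRAME_SIZES_MS:
        raise ValueError(f"unsupported Opus frame size: {frame_ms} ms")
    capture_ms = frame_ms                 # a full frame must be captured first
    jitter_ms = jitter_frames * frame_ms  # receiver holds this many frames
    return capture_ms + encode_ms + transport_ms + jitter_ms

# Placeholder numbers: 6 ms encode, 12 ms over the air, 2-frame jitter buffer.
for frame in (10, 20):
    total = latency_floor_ms(frame, encode_ms=6, transport_ms=12, jitter_frames=2)
    print(f"{frame} ms frames -> ~{total} ms end to end")
```

With these placeholders the frame duration is counted three times (once for capture, twice in the jitter buffer), so shrinking the frame or the buffer depth moves the total far more than speeding up the encoder.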

While this explains the bw limit, I'm still surprised re latency; it sounds really bad even for L2CAP.

You don't just use four simultaneously connected audio profiles and interleave them?