Low latency in Bluetooth audio comes down to codecs and the best are proprietary.
If you want to really cut down latency and need wireless with hardware like this, you could use a second ESP32 and send your own bitstream between them.
I've been experimenting with more-or-less this on the existing ESP32-S3 (well, to a smartphone/PC rather than a 2nd ESP32).
Practical bandwidth limits are in the ~72kb/s range with Bluetooth and a custom wire protocol, and Opus voice-mode encoding can't run in realtime beyond complexity 3; music encoding can't run at all. Maybe there's a more compute-friendly audio codec I'm not aware of, but as far as I know these chips just aren't quite powerful enough for high-quality music encoding, unfortunately. I'm hoping the S31 might be a bit better fit here (decent CPU boost + better SIMD).
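For a rough sense of what a ~72kb/s budget implies, here is a back-of-the-envelope sketch. It reads kb/s as kilobits per second, which is an assumption, and the PCM formats listed are just common configurations, not ones named in the comment above.

```python
# Raw PCM bitrate versus a link budget.
# LINK_BUDGET_KBPS comes from the ~72kb/s figure above, read as kilobits/s
# (assumption). The formats below are illustrative only.
LINK_BUDGET_KBPS = 72.0


def pcm_kbps(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed PCM bitrate in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000.0


def required_compression(sample_rate_hz, bits_per_sample, channels,
                         budget_kbps=LINK_BUDGET_KBPS):
    """How many times smaller the stream must get to fit the budget."""
    return pcm_kbps(sample_rate_hz, bits_per_sample, channels) / budget_kbps


if __name__ == "__main__":
    formats = {
        "16 kHz / 16-bit / mono": (16000, 16, 1),
        "44.1 kHz / 16-bit / stereo": (44100, 16, 2),
        "48 kHz / 16-bit / stereo": (48000, 16, 2),
    }
    for name, fmt in formats.items():
        print(f"{name}: {pcm_kbps(*fmt):.1f} kb/s raw, "
              f"needs ~{required_compression(*fmt):.1f}x compression")
```

Under those assumptions, CD-style stereo needs roughly 20x compression to fit, which is why the codec's compute cost ends up being the bottleneck.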
Latency is still a bit rough with BT overhead. There might be some new options with LE audio on the S31 but I haven't found a way to get below ~80ms with the existing ESP32-S3 stack.
tl;dr, high quality voice is doable today with okay latency, music probably less so, maybe the S31 will be better
I haven't benchmarked Bluetooth on these devices, but have you looked at your uncompressed audio rates over WiFi? OP was asking about Bluetooth at high quality and low latency, which I don't think is a possible combo, hence suggesting another ESP32 if wireless is necessary. If it isn't, a wire is difficult to beat.
ESP-NOW is another option to look at, which of course won't work to transmit to a phone directly but can do a point-to-point or multicast transmission between ESP32 devices. I've used it in some projects but not for audio, I couldn't tell you how much of a buffer would be needed to make that work smoothly.
Another option for OP: if the audio is being synthesized, the parameters could be transmitted rather than the audio samples themselves, with the synthesis done on the receiving device.
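To show why sending parameters is so much cheaper than sending samples, here is a minimal sketch. The 5-byte event layout is entirely hypothetical (it is not MIDI or any existing protocol), and the 44.1kHz/16-bit/mono comparison format is also just an assumption for illustration.

```python
import struct

# Hypothetical wire format for one synth event (not a real protocol):
# event type, note number and velocity as one byte each, followed by a
# 16-bit little-endian time delta in milliseconds.
EVENT_FORMAT = "<BBBH"
NOTE_ON = 1


def pack_event(event_type, note, velocity, delta_ms):
    """Serialize one event into 5 bytes."""
    return struct.pack(EVENT_FORMAT, event_type, note, velocity, delta_ms)


def unpack_event(payload):
    """Inverse of pack_event; returns (event_type, note, velocity, delta_ms)."""
    return struct.unpack(EVENT_FORMAT, payload)


def pcm_bytes(duration_ms, sample_rate_hz=44100, bytes_per_sample=2, channels=1):
    """Bytes of raw PCM needed to cover duration_ms of audio."""
    return sample_rate_hz * bytes_per_sample * channels * duration_ms // 1000


if __name__ == "__main__":
    event = pack_event(NOTE_ON, 60, 100, 10)
    print(len(event), "bytes for the event")
    print(pcm_bytes(10), "bytes for 10 ms of raw audio")
```

The receiver would feed the unpacked values into its own synth engine, so only the event stream has to cross the radio link.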
I have a (WIP) project that transfers audio over ESP-NOW. I haven't touched it in forever, but I remember it did work decently. I had to bring the audio sample rate down to 16kHz though, and it was just sending uncompressed audio. I probably could have dug more into configuring the radios for better throughput, or adding some basic compression to relax the bandwidth requirements.
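As a rough illustration of what uncompressed audio over ESP-NOW looks like at the packet level, here is a sketch. It assumes the classic 250-byte ESP-NOW payload limit and 16-bit mono samples (the comment above doesn't state the sample format), and it ignores headers, retries and packet loss.

```python
# Packetization arithmetic for raw PCM over ESP-NOW.
# Assumption: classic ESP-NOW payload cap of 250 bytes per packet.
ESPNOW_MAX_PAYLOAD = 250


def packet_plan(sample_rate_hz, bytes_per_sample=2, channels=1,
                max_payload=ESPNOW_MAX_PAYLOAD):
    """Return (frames_per_packet, packets_per_second, ms_of_audio_per_packet)."""
    frame_bytes = bytes_per_sample * channels          # one sample per channel
    frames_per_packet = max_payload // frame_bytes     # whole frames only
    packets_per_second = sample_rate_hz / frames_per_packet
    ms_per_packet = 1000.0 * frames_per_packet / sample_rate_hz
    return frames_per_packet, packets_per_second, ms_per_packet


if __name__ == "__main__":
    for label, args in {
        "16 kHz mono": (16000, 2, 1),
        "44.1 kHz stereo": (44100, 2, 2),
    }.items():
        frames, pps, ms = packet_plan(*args)
        print(f"{label}: {frames} frames/packet, {pps:.1f} packets/s, "
              f"{ms:.2f} ms of audio per packet")
```

With these assumptions, 16kHz mono works out to 128 packets per second, while 44.1kHz stereo would need over 700, which fits with dropping the sample rate being the easy fix.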
Fair point, I’ve been evaluating WiFi for my project as well and the ESP32-S3 certainly has better link rates and latency, though I’ve not determined if it can truly run uncompressed. UX-wise my project is pairing to a smartphone so I’m not eager to require that users bring a WiFi network with them (something neither Android nor iOS handle particularly gracefully), but regardless it would be good to have the option.
I was not aware that _any_ Espressif hardware even supported classic Bluetooth other than the very first ESP32 (and I am not sure those are even still available). And I was getting around 50ms latency back then (with the original ESP32 and SBC!)
I’m using BLE GATT messaging with an upgrade path to L2CAP CoC channels for clients that support it. Roughly the path is: audio input -> opus encode -> BLE transmit -> smartphone/desktop. The latency floor ends up being ~80ms due to jitter buffer sizing, etc.
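A pipeline like this accumulates latency at every stage, which can be sketched as a simple budget. Every number below is a placeholder picked so the total lands near the ~80ms mentioned above, purely to show how quickly the terms add up. None of them are measurements from this project, and the stage breakdown itself is an assumption.

```python
def latency_budget_ms(frame_ms, encode_ms, conn_interval_ms, jitter_frames):
    """Sum a simplified capture -> encode -> BLE -> jitter-buffer latency budget.

    All inputs are hypothetical tuning values, not measured figures.
    """
    capture = frame_ms                  # a full frame must be captured before encoding
    transmit = conn_interval_ms         # worst case: wait one full connection interval
    jitter = jitter_frames * frame_ms   # frames held back by the receive-side buffer
    return capture + encode_ms + transmit + jitter


if __name__ == "__main__":
    # Placeholder values: 20 ms frames, 5 ms encode time,
    # 15 ms connection interval, 2-frame jitter buffer.
    print(latency_budget_ms(frame_ms=20, encode_ms=5,
                            conn_interval_ms=15, jitter_frames=2), "ms")
```

The jitter buffer term scales with frame size, so in this simplified model shorter frames cut latency twice over, at the cost of more packets and more encoder overhead.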
Is there any reason you want wireless? Bluetooth audio is a disaster, AFAIK. You don't want to use it for music. Just go wired, the ether is too cramped already.
There aren't many choices of cheap hackable A2DP receivers if you were somehow looking for one. Not all headphone chips are programmed to run at slightly under 3.3V on VBAT pins, so as to protect the supposed battery, and there is no means to reprogram them for lower voltages, officially or not.
> I'm interested in audio out because I dabble in musical instruments.
Sorry, I don't know. I'm just responding to echo and expand on another reply: Bluetooth for anything related to serious music, from audio playback to MIDI input, is a dumpster fire on Windows.
Several years ago I tried to set up a high-end Windows laptop for hobby DAW composition on the go. The real-world BT audio latency just from laptop to headphones/earbuds was unworkable and, separately, the input latency from BT midi controllers was unworkable. Stacked together the total lag was laughable.
At the time, the issues were widely known and much lamented. Some tech blogs (including one at MSFT) indicated there were issues at every level of the stack (drivers, firmware, silicon) and work was proceeding to address the end-to-end shit show. The only workable Windows solutions referenced online involved using specific non-Bluetooth wireless devices. Needing to have a dedicated USB dongle hanging off the laptop, combined with having a choice of either one specific device or a receiver dongle to support all devices, is less appealing than just having a wire.
Since then I've looked again every year or so but have seen no reports yet of meaningful progress and there's even less discussion of work in progress. Very disappointing. And the situation on the BT audio quality side doesn't seem much better. If you don't want degraded audio quality it's either choosing very specific devices which support a proprietary BT codec or switching to non-BT wireless dongle hardware. At least there is talk of improvement on audio quality but no clear indication better baseline minimum audio quality will ever be mandated in the BT audio standard.
If anyone has info that the baseline latency or quality (input or output) of standard BT devices in Windows configs will improve, I'd be delighted to hear it.
I'll mention that you usually need to put the BT connection in a low latency audio profile or else you're likely to get something more suitable for mp3-style high buffer playback.
Thanks for the tip. I'm about to revisit this again soon (new laptop + some free time for fun). Have you been able to get BT latency low enough on Windows to hit a MIDI key and hear the note without noticeable lag?
It's been a few years since the last time I actually tried it myself, instead of just checking user reports. I do remember I followed the DAW company's FAQ, set a mode in the DAW, and switched something in Windows settings related to BT. The wired latency (MIDI in and audio out) was excellent but switching either to BT tanked it.
It's frustrating that it appears to not have improved at all in a decade. I get the whole "good, fast, cheap" triangle and that most of the BT ecosystem only cares about "cheap" while being just good enough that 128Kbps MP3s don't sound too much worse on $50 cuff link-sized earbuds. But I can't help naively thinking that on decadal timescales, the rising tide should lift even the "cheapest" corner of the triangle enough to yield slightly better minimum baseline quality - especially when it's been stuck forever at barely usable. Even more surprising is that BT gaming controllers still have such high latency that most of them also come with a proprietary wireless dongle. Talk about pointless COGS and landfill.
I guess maybe the reason is that those who really care can go wired, use non-BT wireless dongles or lock themselves to a proprietary vendor who controls both ends of the stack. But it kind of nerfs the point of having a short range wireless "standard" if it doesn't cut COGS or landfill waste and never improves more 'serious' use cases even a little.
Espressif products are not ideal for Bluetooth audio since support for classic Bluetooth (which is what is still mostly used for Bluetooth audio) is hit or miss, and on newer models often entirely missing.