Just run a compressor! Like the Loudmax plugin if your audio source is a PC, or buy a Behringer CL9 ($30) and plug it into your audio chain if you don't want a computer involved. Problem solved.
This looks like someone using vibe coding where a web search would suffice.
I also fail to see how an ML model is necessary for what is essentially an AGC (Automatic Gain Control). Even something fancy following the EBU R 128 standard like https://lsp-plug.in/?page=manuals&section=autogain_stereo will be orders of magnitude more performant.
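To make the point concrete, here is a rough sketch of how little a bare-bones AGC needs. It is not EBU R 128 (no K-weighting, no gating) and not what any of the plugins mentioned above actually do. The function name `agc` and every parameter value are my own assumptions, picked only for illustration.

```python
import math

def agc(samples, target_rms=0.1, attack=0.01, release=0.001, max_gain=10.0):
    """Feed-forward AGC over float samples in [-1, 1] (illustrative only)."""
    # Start the power estimate at the target so the initial gain is 1.
    power = target_rms ** 2
    out = []
    for x in samples:
        inst = x * x
        # Follow rises in level quickly (attack) and falls slowly (release).
        coeff = attack if inst > power else release
        power += coeff * (inst - power)
        # Gain that would bring the estimated level to the target, capped
        # so that silence is not amplified without limit.
        gain = min(max_gain, target_rms / math.sqrt(max(power, 1e-12)))
        out.append(x * gain)
    return out
```

Because the level estimate rises faster than it falls, it sits a bit above the true RMS, so the output settles somewhat below `target_rms`. The point is only that a loud and a quiet input end up at roughly the same level, with a few multiplications per sample.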
I do find the "out-of-band" solution of adjusting volume with an IR remote interesting but this thing is so vibecoded to hell and back and there is no source code to speak of... hard pass.
So much LLM marketing/hype speak in the readme, so many moving parts, failure points, wasted CPU cycles, added latency, not to mention the externalities of the tokens that were burned on this...all to do what even the cheapest audio processors are able to do.
Back in the day you could dismiss all of that as "it's part of the learning path" and yes, I made over-engineered non-solutions to long-solved problems when I started programming too. But this isn't learning. It's pure LLM slop.
LLMs are doing the thing that greedy/unethical programmers used to. They'd quote a client a dozen microservices and four months of work for something that could be solved by writing a slightly longer Excel formula. Not that the quote was crazy or the work was poorly done, it just wasn't anywhere near necessary to solve the problem. But they got paid and the client was happy because they didn't know any better...until I showed up to ruin the fun.
For a long time I was bothered by how modern TVs still struggle with stable audio levels.
Commercials are louder, dialogue is quieter, and sudden spikes can wake up half the house.
Even with built‑in “volume leveling”, the problem persists — especially during ads and streaming.
After trying different solutions over the years, nothing worked reliably.
So I built a small offline system that listens to TV audio in real time, detects sudden loudness spikes, and sends IR volume adjustments locally.
No cloud, no APIs, no smart‑home ecosystem — just local audio analysis + a simple IR blaster.
The hardest part wasn’t the code, but figuring out when audio truly requires intervention.
That led me to design a small signal‑behavior model that looks at micro‑events, context and dynamics before deciding whether to react.
The prototype is now stable enough for daily use. Not perfect yet, but the foundation works.
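Since no source is published, the following is only a guess at the general shape of "detect a sustained loudness jump, then send a volume command". It is not the author's model. The names `rms_db` and `volume_commands`, the thresholds, and the `"VOL_DOWN"` command string are all hypothetical, and actually transmitting the IR code is left out.

```python
import math

def rms_db(block):
    """RMS level of one block of float samples, in dBFS (floored at -120 dB)."""
    mean_sq = sum(x * x for x in block) / len(block)
    return 10.0 * math.log10(max(mean_sq, 1e-12))

def volume_commands(blocks, threshold_db=6.0, hold_blocks=3,
                    cooldown_blocks=10, slow=0.05):
    """Yield (block_index, "VOL_DOWN") when loudness jumps and stays high."""
    baseline = None   # slow-moving estimate of the "normal" level
    above = 0         # consecutive blocks above baseline + threshold
    cooldown = 0      # blocks to wait before another command may be sent
    for i, block in enumerate(blocks):
        level = rms_db(block)
        if baseline is None:
            baseline = level
            continue
        if cooldown > 0:
            cooldown -= 1
        if level - baseline > threshold_db:
            above += 1
        else:
            above = 0
            # Adapt the baseline only while the audio is not spiking.
            baseline += slow * (level - baseline)
        # Require a sustained jump so a single loud block is ignored.
        if above >= hold_blocks and cooldown == 0:
            yield i, "VOL_DOWN"
            cooldown = cooldown_blocks
            above = 0
```

With these assumed settings, a one-block bang produces nothing, while three consecutive blocks at least 6 dB above the baseline produce one command followed by a pause. Deciding when to turn the volume back up is not covered here.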