by mattmanser 2623 days ago
Surely a computer can do that in sub-second time? As soon as it sees the signature of a well-known advert, start blocking.

Then again, what about clips from the show that you're watching? It might start blocking part of the show because it thinks it's an advert.

5 comments

I think that you can just look for specific markers in the stream to detect ads. For example, I look for SCTE-35 markers (https://en.wikipedia.org/wiki/SCTE-35) in Twitch streams to filter out their ads. It takes next to no CPU. I believe broadcast TV has a similar mechanism in many places.
What software do you use to do that?
Well, I watch Twitch using a personal fork of https://github.com/SebastianRask/Pocket-Plays-for-Twitch (which isn't really actively maintained). When Twitch ads became in-stream this year, I replaced the default Android media player it uses with Google's ExoPlayer, then made a bunch of changes to ExoPlayer so it could detect when SCTE-35 segments would be played and silence them.

I modeled my change on similar changes made by Streamlink's Twitch plugin recently: https://github.com/streamlink/streamlink/pull/2372
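
A minimal sketch of the marker-based filtering described above, in Python rather than the ExoPlayer code the commenter actually changed. It assumes an HLS media playlist in which ad segments are bracketed by `#EXT-X-SCTE35-OUT` and `#EXT-X-SCTE35-IN` tags; tag names vary between services, and `filter_ad_segments` is a hypothetical helper, not part of Streamlink or ExoPlayer.

```python
# Assumed tag names; real services may signal SCTE-35 breaks differently.
AD_START = "#EXT-X-SCTE35-OUT"
AD_END = "#EXT-X-SCTE35-IN"


def filter_ad_segments(playlist_text):
    """Return segment URIs from an HLS playlist, skipping those inside an ad break."""
    in_ad = False
    kept = []
    for raw in playlist_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(AD_START):
            in_ad = True
        elif line.startswith(AD_END):
            in_ad = False
        elif not line.startswith("#") and not in_ad:
            kept.append(line)  # a segment URI outside any ad break
    return kept


playlist = """#EXTM3U
#EXTINF:2.0,
show-1.ts
#EXT-X-SCTE35-OUT:DURATION=4.0
#EXTINF:2.0,
ad-1.ts
#EXTINF:2.0,
ad-2.ts
#EXT-X-SCTE35-IN
#EXTINF:2.0,
show-2.ts
"""
print(filter_ad_segments(playlist))
```

This is only string matching on the playlist, which is why the approach costs almost nothing compared with analysing the video itself.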

I have used this method with node.js+ffmpeg for Australian TV streams:

https://gist.github.com/satori99/2cb06938bfe8532ecc00e609065...

Seems like this is something machine learning would be very good at, given that commercial visuals look very different from “real” show visuals
> Seems like this is something machine learning would be very good at, given that commercial visuals look very different from “real” show visuals

This feature was in MythTV literally 10+ years ago. Commercials are very easy to detect using simple methods [0].

[0] https://www.mythtv.org/wiki/Commercial_detection

As far as I know, audio levels in commercials are different too. I don't know the exact number, but I seem to recall reading some time ago that commercials are usually 30% louder or something. They are definitely noticeably louder than most shows, except maybe sitcoms.

ETA: Looking it up, it looks like America has laws about this now. I'm in Canada, so I'm not sure if it's the same here.

And on further reading, it seems like advertisers may be using excessive compression to artificially boost loudness and skirt the law.

There's volume and then there's "loudness."

Even if peak volume is under some prescribed maximum, compressed audio can feel very loud. Commercials often leave very little dynamic space, which makes them jarring in that they feel "loud."
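
A rough illustration of the volume vs. "loudness" point: two signals with the same peak can have very different average energy. This sketch uses plain RMS as a crude stand-in for perceived loudness (real loudness meters use weighted measures such as ITU-R BS.1770), and the test signals and gain values are made up for the example.

```python
import math


def peak_and_rms(samples):
    """Return (peak amplitude, RMS level) of a list of samples."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return peak, rms


def to_db(ratio):
    """Convert an amplitude ratio to decibels."""
    return 20 * math.log10(ratio)


rate = 8000  # samples per second (arbitrary for the example)
tone = [math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]

# "Dynamic" programme: mostly quiet, with one short loud burst.
dynamic = [s * (0.9 if 2000 <= n < 3000 else 0.1) for n, s in enumerate(tone)]
# "Compressed" advert: sits near the same peak the whole time.
compressed = [s * 0.9 for s in tone]

peak_d, rms_d = peak_and_rms(dynamic)
peak_c, rms_c = peak_and_rms(compressed)
print(f"peaks: {peak_d:.2f} vs {peak_c:.2f}")
print(f"compressed signal is {to_db(rms_c / rms_d):.1f} dB higher in RMS")
```

Both signals would pass a peak limit of 0.9, yet the compressed one carries far more energy, which is the effect described above.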

Movies also have loud scenes, which would cause false positives.
Why bother? There are already known ways to detect commercial breaks, like so many milliseconds of black screen and such. Also, there are very few distinct commercials airing on television at any given time, including local ads. You could just fingerprint them.
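
The black-screen heuristic mentioned above can be sketched in a few lines. This assumes you already have one mean-brightness value per decoded frame (0 to 255); the `threshold` and `min_ms` defaults are guesses for illustration, and `find_black_gaps` is a hypothetical helper, not how MythTV or any other tool implements it.

```python
def find_black_gaps(frame_luma, fps, threshold=20.0, min_ms=200):
    """Return (start_s, end_s) spans where mean luma stays below
    `threshold` for at least `min_ms` milliseconds."""
    min_frames = max(1, round(min_ms * fps / 1000))
    gaps = []
    run_start = None
    # The infinite sentinel closes a black run that reaches the end of the input.
    for i, value in enumerate(list(frame_luma) + [float("inf")]):
        if value < threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start >= min_frames:
                gaps.append((run_start / fps, i / fps))
            run_start = None
    return gaps


# 25 fps: 1 s of picture, 0.4 s of black, 1 s of picture,
# then 2 black frames (too short to count) and a little more picture.
luma = [120.0] * 25 + [3.0] * 10 + [110.0] * 25 + [2.0] * 2 + [115.0] * 5
print(find_black_gaps(luma, fps=25))
```

Each reported gap is only a candidate break boundary; dark scenes in the show itself would produce the same signal.
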
Just look for consistently louder volume.
Or check whether the TV channel logo is shown in the corner. At least in Germany, the logo is hidden during commercials.
Surely subtitles would be an awesome signature to use?
Plex says this about their commercial removing tech:

"The process is CPU-intensive and can take several minutes to complete, depending on the recording duration. On a reasonably fast CPU, we typically see a 30-minute recording take 2-4 minutes to process."

Assuming they know what they are doing, real-time blocking does not seem possible with a reasonably modern approach.

What leads you to believe that something that can process a 30-minute file in 3 minutes can't be done in real time?
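
The arithmetic behind that question, using the figures Plex quotes (a 30-minute recording processed in 2 to 4 minutes). `realtime_factor` is just a name for the ratio, nothing Plex or Comskip defines.

```python
def realtime_factor(recording_min, processing_min):
    """How many minutes of video are analysed per minute of wall-clock time."""
    return recording_min / processing_min


for processing_min in (2, 4):
    factor = realtime_factor(30, processing_min)
    print(f"{processing_min} min to process -> {factor:.1f}x real time")
```

So raw throughput is roughly 7.5x to 15x faster than playback. That says nothing about latency, though: whether the analysis needs to see later parts of the recording before deciding is a separate question.
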
Maybe more context:

"The recording is analyzed on various characteristics such as black frames, silences and changes in aspect ratio.

Based on this information Comskip segments the recording in blocks and using heuristics, together with additional information such as the presence of logo, the scene change rate, Close Captioning information and other information sources Comskip tries to determine what blocks of the recording are to be characterized as commercials."

It doesn't say it explicitly, but based on this and the note on high CPU utilization, I get the impression it has to look back and forward to determine where the commercial is.

Given a historical corpus and enough of a buffer (likely a few minutes), it should be possible to do this in near real time. Especially with some kind of classifier trained on historical data, it should be possible at a close-enough-to-real-time-to-be-useful scale.
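
A toy sketch of the buffered approach: each segment is held back until `lookahead` later segments have arrived, then judged using its neighbours on both sides. The per-segment flags stand in for whatever noisy detector you have, and the majority vote is a placeholder of my own, not Comskip's heuristics; `delayed_filter` is a hypothetical name.

```python
def delayed_filter(segments, lookahead=2):
    """Yield names of segments judged to be programme, delayed by `lookahead`.

    `segments` yields (name, flag) pairs, where flag is a noisy per-segment
    ad guess. A segment is dropped when most flags in its window say "ad".
    """
    buf = []      # everything seen so far (a real player would trim old entries)
    decided = 0   # index of the next segment awaiting a decision

    def is_ad(i, end):
        window = [flag for _, flag in buf[max(0, i - lookahead):end]]
        return sum(window) * 2 > len(window)

    for seg in segments:
        buf.append(seg)
        # A segment can be decided once `lookahead` later segments exist.
        while decided + lookahead < len(buf):
            if not is_ad(decided, decided + lookahead + 1):
                yield buf[decided][0]
            decided += 1
    # End of stream: flush the tail with whatever context is available.
    while decided < len(buf):
        if not is_ad(decided, len(buf)):
            yield buf[decided][0]
        decided += 1


names = ["s1", "s2", "s3", "s4", "s5",
         "a1", "a2", "a3", "a4", "a5", "a6", "a7",
         "s6", "s7", "s8"]
# s2 is a false positive (say, a loud scene); a4 is a missed ad segment.
flags = [0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0]
print(list(delayed_filter(zip(names, flags))))
```

The price of the look-ahead is delay: playback lags the live stream by `lookahead` segments, which is the buffer the comment above is talking about.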