by phenkdo 1707 days ago
Can you explain how you are hashing the video? I took a quick look at the GitHub repo and don't see details...

They're extracting video frames at some interval (default of 1 second) as 144x144px stills and then turning them into a square collage. That collage then has a perceptual hash performed on it.
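To make those steps concrete, here is a rough sketch. It is not the videohash code: frames are assumed to be already decoded into small grayscale grids (lists of rows of 0-255 ints), `make_collage`, `average_hash` and `hamming` are hypothetical helpers, and a plain average hash stands in for whatever perceptual hash the library actually applies.

```python
import math

def make_collage(frames, fill=0):
    """Tile equally sized grayscale frames into a near-square grid, row by row."""
    if not frames:
        raise ValueError("need at least one frame")
    tile_h, tile_w = len(frames[0]), len(frames[0][0])
    per_row = math.ceil(math.sqrt(len(frames)))  # tiles per collage row
    rows = math.ceil(len(frames) / per_row)
    collage = [[fill] * (per_row * tile_w) for _ in range(rows * tile_h)]
    for i, frame in enumerate(frames):
        top, left = (i // per_row) * tile_h, (i % per_row) * tile_w
        for y, row in enumerate(frame):
            collage[top + y][left:left + tile_w] = row
    return collage

def average_hash(image, size=8):
    """Block-average down to size x size cells, then set one bit per cell
    depending on whether the cell is brighter than the overall mean."""
    h, w = len(image), len(image[0])
    if h < size or w < size:
        raise ValueError("image is smaller than the hash grid")
    cells = []
    for by in range(size):
        ys = range(by * h // size, (by + 1) * h // size)
        for bx in range(size):
            xs = range(bx * w // size, (bx + 1) * w // size)
            block = [image[y][x] for y in ys for x in xs]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (value > mean)
    return bits

def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")
```

Two videos would then be compared with `hamming(average_hash(make_collage(frames_a)), average_hash(make_collage(frames_b)))`, with a small distance treated as a match. Because every frame lands in a fixed tile, shifting the content by even one frame interval moves it to different tiles.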

The major problem here is that two videos with the exact same content but slightly different timing (say, one with a couple-second intro) will rarely, if ever, produce a positive match.

The only cases where I see this particular scheme being helpful are where you've got videos with the same contents but different encodings. The length will be the same, but the quality (and file names) of the two encodings might differ. This would help you find them in a sea of files.

A simple improvement would be to check only the frame from the middle of each video first. If the frames at the same timestamp match, you've got a non-zero probability that the videos match. Then you can check more frames radiating out from the center point. Negative matches will fail fast and save you work. It also matches when the lengths are dissimilar because of trims or splices at the beginning and end of the videos.
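A minimal sketch of that middle-out check, assuming each video has already been reduced to a list of per-frame integer hashes (one per sampled frame) and that the two midpoints show the same content. The function names and the `max_distance` threshold are made up for illustration.

```python
def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")

def center_out_offsets(limit):
    """Yield 0, +1, -1, +2, -2, ... up to +/-limit."""
    yield 0
    for step in range(1, limit + 1):
        yield step
        yield -step

def middle_out_match(hashes_a, hashes_b, max_distance=4):
    """Compare frame hashes outward from each video's midpoint.

    Returns False at the first pair that differs by more than max_distance
    bits (fail fast), and True if every overlapping pair matches.
    """
    if not hashes_a or not hashes_b:
        return False
    mid_a, mid_b = len(hashes_a) // 2, len(hashes_b) // 2
    # Largest offset that stays inside both lists on both sides.
    reach = min(mid_a, mid_b,
                len(hashes_a) - 1 - mid_a, len(hashes_b) - 1 - mid_b)
    for offset in center_out_offsets(reach):
        if hamming(hashes_a[mid_a + offset], hashes_b[mid_b + offset]) > max_distance:
            return False
    return True
```

For unrelated videos the very first comparison usually fails, so most of the corpus is rejected after hashing a single frame per file.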

A second improvement would be to pick a frame from the A video and scan through the B video (or segment of each) to find a high probability match. Then check other segments of the video for matches in the same way.
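Roughly, that could look like the sketch below. Same assumption as before: each video is a list of per-frame integer hashes. `best_alignment` and `overlap_match_ratio` are hypothetical names, and the bit threshold is arbitrary.

```python
def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")

def best_alignment(hashes_a, hashes_b, anchor_index, max_distance=4):
    """Scan B for the frame closest to A's anchor frame.

    Returns the frame offset (index in B minus index in A), or None when
    nothing in B is within max_distance bits of the anchor.
    """
    anchor = hashes_a[anchor_index]
    best = min(range(len(hashes_b)),
               key=lambda j: hamming(anchor, hashes_b[j]),
               default=None)
    if best is None or hamming(anchor, hashes_b[best]) > max_distance:
        return None
    return best - anchor_index

def overlap_match_ratio(hashes_a, hashes_b, offset, max_distance=4):
    """Fraction of overlapping frames that match once B is shifted by offset."""
    pairs = [(hashes_a[i], hashes_b[i + offset])
             for i in range(len(hashes_a))
             if 0 <= i + offset < len(hashes_b)]
    if not pairs:
        return 0.0
    hits = sum(hamming(x, y) <= max_distance for x, y in pairs)
    return hits / len(pairs)
```

If `best_alignment` finds an offset, `overlap_match_ratio` at that offset says how much of the rest lines up, so a couple-second intro shows up as a non-zero offset rather than as a failed match.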

Turning a video into a single static representation and comparing that is not the best approach.

Wouldn't it make more sense to convert the video to greyscale and e.g. detect significant changes of brightness between frames and store them as vector coordinates (% of the playtime, brightness delta)?

That could work. But I think limiting your search to brightness patterns is going to make for a lot of false positives. The brightness search might make for a good first pass to find a subset of the corpus for a more in-depth search.
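A quick sketch of what such a brightness signature and first-pass filter might look like. It assumes you already have the mean greyscale brightness of each sampled frame as a list of numbers; the function names, the jump threshold and the tolerances are invented for illustration.

```python
def brightness_signature(frame_means, threshold=20.0):
    """Return (percent_of_playtime, brightness_delta) for every jump in mean
    brightness between consecutive frames that is at least `threshold`."""
    n = len(frame_means)
    signature = []
    for i in range(1, n):
        delta = frame_means[i] - frame_means[i - 1]
        if abs(delta) >= threshold:
            signature.append((round(100 * i / (n - 1), 1), delta))
    return signature

def signatures_similar(sig_a, sig_b, pos_tol=2.0, delta_tol=10.0):
    """Coarse filter: same number of jumps, each at a similar position and size."""
    if len(sig_a) != len(sig_b):
        return False
    return all(abs(pa - pb) <= pos_tol and abs(da - db) <= delta_tol
               for (pa, da), (pb, db) in zip(sig_a, sig_b))
```

Anything that passes `signatures_similar` would still need the more in-depth frame comparison, since quite different videos can share a similar pattern of cuts.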
I think it creates a collage of the video frames: https://github.com/akamhy/videohash/blob/8759b6ad7fdabcdf4dd...

and passes that on to the videohash.py module to generate a hash: https://github.com/akamhy/videohash/blob/main/videohash/vide...

by using the library imagehash: https://pypi.org/project/ImageHash/

It's open source ;)

Apparently, they extract video frames using FFmpeg, create a collage out of those frames, then use the whash (wavelet hash) method of the Python imagehash package.

So it's basically reducing video hashing to image hashing, which was previously solved.