by MKinley 2057 days ago
I am not in law enforcement myself, but I do some online investigations, and like many others in my field we use youtube-dl to save a copy of video evidence relevant to a case we are working on. It can be instrumental for archiving the evidence in case it is ever removed or taken down, and it can also be used to grab extra info (the CC text log is one example) that can then be manually searched for specific strings. This tool has made many investigations pay off in ways they never could have otherwise.

How is this significantly different than placing a video camera in front of a screen?

youtube-dl is surely nicer and more convenient, but from an evidence standpoint, a one-time $500 (to go wild) setup should have you covered, no?

(Mourning the loss of ytdl myself but trying to be realistic)

Eek, the thought of that makes me cringe. I guess it would be passable in some circumstances, but I can imagine a bunch of reasons why it would be miserable or useless:

* the quality drop; recompressing, non-matching frame rates, non-matching resolution--all the same goes for audio. You're very likely clipping (losing) data. This is assuming you're doing screen capture. If you're literally videotaping a monitor you will get moiré in the video, room tone in the audio, loss of any stereo separation, and other audio/video artifacts.

* performance; must be done in real-time, cannot queue up multiple sources. This is likely the biggest efficiency killer and makes things 100x more labor intensive.

* reliable Internet; if you get a blip or have a slow connection you have to hopefully catch it and start over. With youtube-dl you can pause, resume, confirm even on the slowest, spottiest connections.

* metadata, organizing, indexing; likely hand-typed separately, prone to error, prone to not knowing if you've done that video already.

* Chain of custody; grabbing the original video allows you to prove two identical copies match (using file hashes or other comparisons); screen recording makes that difficult to impossible to confirm--maybe with fancy AI you'd have to run by the courts?
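
For anyone wondering what the file-hash comparison in the last bullet looks like in practice, here is a minimal sketch. It assumes SHA-256 as the hash (the comment above does not name one), and the function names `sha256_of_file` and `copies_match` are made up for illustration. It only shows that two local copies are byte-for-byte identical; it says nothing about where the files came from.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large video files are never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(path_a, path_b):
    """True when both files have the same SHA-256 digest."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)
```

Calling `copies_match("copy_a.mp4", "copy_b.mp4")` (hypothetical file names) returns `True` only when the two downloads are identical. A screen recording of the same video will practically never produce a matching digest, which is the point the bullet is making.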

Eek, you are responding to my comment as if it was a freestanding response about archival copy and law enforcement work, when it was specifically in response to someone saying he was using it for neither. It's not surprising it makes you cringe, but please consider in context.

> the quality drop; recompressing, non-matching frame rates, non-matching resolution--all the same goes for audio.

Are you trying to preserve quality or prove something? My response was in the context of "gathering evidence", but not police work and not archival quality. Would such a copy cause you problems in proving libel, copyright infringement, illegitimate disclosure, etc.?

> performance; must be done in real-time, cannot queue up multiple sources

You can most definitely queue up multiple sources. Just make a YouTube playlist and record it. Yes, it runs in real time: in general you'll take 10 hours to capture 10 hours of video. That's not an issue for evidence gathering in a non-law-enforcement context.

> metadata, organizing, indexing; likely hand-typed separately, prone to error, prone to not knowing if you've done that video already.

Again - consider the context of my answer, NOT archival quality anything. The "CC" stream GP mentioned, which can be searched, has also seen many revisions for many files, both when the Google STT algorithms are revised and through corrections.

> Chain of custody; grabbing the original video allows you to prove two identical copies match (using file hashes or other comparisons); screen recording makes that difficult to impossible to confirm--maybe with fancy AI you'd have to run by the courts?

You have no chain of custody. You can prove two downloads are the same, but YouTube does not guarantee they keep the file the same (indeed, they've modified files several times, changing formats and even remastering old '80s videos). If a file is later pulled (which is what GP was talking about), what are you going to compare it to?

Chain of custody is law enforcement business. They'll get the files from YouTube directly, with affidavits and statements about it and any modifications, if they need it in court. You are going to civil court, and youtube-dl is not making your evidence more valid than a screen recording.

Is it possible to replace youtube-dl with a video camera in front of the screen? Sure.

Did I expect anyone, ever, to propose that? Let's just say, 2020 is full of surprises. I want off this wild ride.

This, along with screen capture, is known as a form of rebroadcast and it's used to obscure and obfuscate digital alterations, watermarks, deepfake artifacts and the like. When doing media forensics, it's optimal to get as close to the raw source as possible.

Isn't this just due to more compression? What's stopping someone from turning down the bitrate and re-encoding a video into different formats a few times to kill the quality (which will still look fine on a 6 inch phone in portrait orientation)?

That's one of the effects we model, in fact; "social media laundering" is the term of art.

Various detectors are more or less thwarted by it. It actually surprised me how strong the artifacts from some GANs are - they can survive several passes of re-encoding, but accuracy does suffer.

But I still need the raw 720/1080 stream for training.

Nothing. But his point is that raw files are better than a screen capture or recording.

I'm afraid to ask how you take screenshots.

With the appropriate tools, such as gnome-screenshot.

But when I find myself on a locked-down computer - e.g. watching a movie on an Apple TV, or when I was shown surveillance video but was refused a copy for some bureaucratic reason (or it required a different license to export; the reasons were unconvincing) - I use a mobile device.

youtube-dl lets you download whole accounts or playlists.

Using a screen recorder like ShadowPlay or OBS also doesn't seem like it would be out of the realm of possibility.

I think this is an absolutely reasonable question.

Filming the screen means that, in order to fake it, you have to set up something that routes youtube.com to your own fake version of YouTube before filming. To me, that sounds much harder than just saying "this file was downloaded from here on that date".

"oh nice, a youtube-link from one of my sources, let me get my camera set up to archive it ..."

I kind of expect a serious investigator to archive these materials just for the sake of it. I don't expect them to make it harder on themselves for no good reason.

> (Mourning the loss of ytdl myself but trying to be realistic)

It's just a GitHub repository which is lost, not ytdl itself.

So - if the issue is really the marketing around youtube-dl, does this mean someone can create a fork named something else, use different marketing, and carry on?

I can see no reason why not. Just don't put it on GitHub; put it on private hosting in a country unfriendly to copyright trolls.

Obviously the CC stream is not readily searchable if you record it with a video camera.

Well, sure, but Google/YouTube search does find things in it. The GP was talking about their work collecting evidence - they can find it just as well, and record a copy for posterity just as well.

I am not saying it’s as convenient (more options >> less options except for analysis-paralysis). But I don’t understand how it tips the scale to making any archiving or evidence gathering unusable or uneconomical. (I am not saying GP is wrong - just want more explanation so I can understand)