HN Mirror



	by candiddevmike 1550 days ago
	So the folks at the forefront of deepfake technology (i.e. the attackers you're targeting) will slip through your product because it lags behind the state of the art (like AV, which you said is the approach you're following), while innocent folks will be caught by it and face a new Kafkaesque version of "prove you're not a bot", since you focus on reducing false negatives. Hopefully I can avoid companies using your product.

2 comments

btown 1550 days ago

Retrospective antivirus-esque techniques are still useful, though, as not every actor is a state-level actor. Even for state-level actors, detecting previous exploits/models out of the box forces them to "burn" their state-of-the-art exploits/models, which slows down abuse by those actors.

And realistically, since deepfake detection will inevitably be more expensive than captchas or antivirus scanning, this will be adopted by human-in-the-loop organizations for critical processes where threat scoring or moderation is already being applied.

That said, Reality Defender: please train your system on diverse human data sets, do not release models where ethnicity or gender (including gender identity) are nontrivially correlated with deepfake score, and have processes in place from day 1 to allow users to report suspected patterns of bias. The Kafkaesque "prove you're not a bot" scenario envisioned by the parent poster is one thing for holistic human-in-the-loop verification processes, and another thing entirely if it suppresses minority voices and minority access to government services.
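
One way to make the "nontrivially correlated" check concrete is to compare false-positive rates across demographic groups on genuine (non-fake) samples. The following is only a minimal Python sketch under stated assumptions: the record layout, the function names, the 0.5 threshold, and the sample data are all hypothetical, and nothing here reflects how Reality Defender or any other vendor actually tests for bias.

```python
from collections import defaultdict


def false_positive_rate_by_group(records, threshold=0.5):
    """Return {group: FPR} over genuine samples only.

    `records` is an iterable of (group, score, is_fake) tuples.
    The layout and the default threshold are assumptions for illustration.
    """
    flagged = defaultdict(int)   # genuine samples wrongly flagged as fake
    genuine = defaultdict(int)   # all genuine samples seen per group
    for group, score, is_fake in records:
        if is_fake:
            continue  # false positives are only defined on genuine samples
        genuine[group] += 1
        if score >= threshold:
            flagged[group] += 1
    return {group: flagged[group] / genuine[group] for group in genuine}


def max_fpr_gap(rates):
    """Largest difference in false-positive rate between any two groups."""
    if not rates:
        return 0.0
    return max(rates.values()) - min(rates.values())


# Made-up scores, purely to show the shape of the check.
sample = [
    ("group_a", 0.10, False),
    ("group_a", 0.70, False),
    ("group_a", 0.20, False),
    ("group_a", 0.30, False),
    ("group_b", 0.80, False),
    ("group_b", 0.60, False),
    ("group_b", 0.20, False),
    ("group_b", 0.40, False),
    ("group_b", 0.90, True),
]

rates = false_positive_rate_by_group(sample)
gap = max_fpr_gap(rates)
```

A large gap between groups at the same threshold would be one signal that real people in one group are being asked to "prove they're not a bot" more often than others. What counts as an acceptable gap is a policy decision, not something this sketch can answer.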


bpcrd 1550 days ago

We agree. Dataset fidelity and bias are major concerns with publicly available datasets. For this reason, we are working to develop programmatically created datasets along with anti-bias testing and policies.


prometheus76 1550 days ago

"Bias" and "anti-bias" are a slippery snake that will bite you as soon as it warms up to you.


calvinmorrison 1550 days ago

Of course, because these companies are probably owned, in the end, by the same people who develop the deepfake datasets, generating endless income from both sides.

It's like ADA compliance lawsuits. I can't prove that accessiBe or other "ADA compliance" web tooling vendors are generating these lawsuits, but those companies would not exist without them. Why wouldn't they want more lawsuits?


ChefboyOG 1550 days ago

The majority of large, popular datasets in deep learning are curated and hosted by academics:

https://paperswithcode.com/task/deepfake-detection#datasets


rdrdg 1549 days ago

Thanks, yes, we benchmark on these research datasets as well.
