by hedora 108 days ago
So, the false negative rate was 84%, but what was the false positive rate?

They have a table "AUTOMATIC SCAN RESULTS (263 URLS)" that sort of presents this information. Of the 9 sites that were negatives, they say they incorrectly flagged 6 as phishing.

With a false positive rate of 66%, it's not surprising they were able to drive down their false negative rate. Also, the test set of 254 phishing sites with 9 legitimate ones is a strange choice.

(Or maybe they need to work on how they present data in tables; I only skimmed the supporting text.)
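
To make the arithmetic explicit, here is a minimal Python sketch of how the 66% figure falls out of the counts quoted above (9 legitimate sites, 6 of them flagged, alongside 254 phishing sites). The function and variable names are illustrative, not from the article, and it assumes the 3 unflagged legitimate sites count as true negatives.

```python
def false_positive_rate(false_positives: int, true_negatives: int) -> float:
    """FPR = FP / (FP + TN): the share of legitimate sites that were flagged."""
    return false_positives / (false_positives + true_negatives)


# Counts as quoted in the comment above.
legit_total = 9       # legitimate sites in the test set
flagged_legit = 6     # legitimate sites incorrectly flagged as phishing
phishing_total = 254  # phishing sites in the test set

# Assumption: every legitimate site that was not flagged is a true negative.
fpr = false_positive_rate(flagged_legit, legit_total - flagged_legit)
legit_share = legit_total / (legit_total + phishing_total)

print(f"false positive rate: {fpr:.1%}")             # 66.7%
print(f"legitimate share of test set: {legit_share:.1%}")  # 3.4%
```

With only 9 legitimate sites, each one moves the false positive rate by about 11 points, so the estimate is very coarse either way.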


The false positive rate was 66% for "automatic scan" and 100% (!) for "deep scan".

In other words, you can get these numbers if your deep scan filter is isSuspicious() { return true; }.

Brb, applying for YC funding for my new AI-based phishing detection system.

(‘return true’ is just a very optimized neural network after all!)
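
A quick sketch of the `return true` point, assuming the 254 phishing / 9 legitimate split mentioned earlier in the thread. The `is_suspicious` function and the placeholder URLs are hypothetical, not anything from the article.

```python
def is_suspicious(url: str) -> bool:
    """Hypothetical 'detector' that flags every URL, as in the joke above."""
    return True


# Ground truth: True = phishing, False = legitimate (254 / 9 split from the thread).
labels = [True] * 254 + [False] * 9
# Placeholder URLs; the detector ignores its input anyway.
predictions = [is_suspicious(f"https://example.test/{i}") for i in range(len(labels))]

false_positives = sum(pred and not truth for pred, truth in zip(predictions, labels))
false_negatives = sum(truth and not pred for pred, truth in zip(predictions, labels))
correct = sum(pred == truth for pred, truth in zip(predictions, labels))

fpr = false_positives / labels.count(False)  # flagged legitimate / all legitimate
fnr = false_negatives / labels.count(True)   # missed phishing / all phishing
accuracy = correct / len(labels)

print(f"false positive rate: {fpr:.0%}")  # 100%
print(f"false negative rate: {fnr:.0%}")  # 0%
print(f"accuracy: {accuracy:.1%}")        # 96.6%
```

On a test set this lopsided, flagging everything still scores about 96.6% accuracy, which is why the false positive rate is the number worth looking at.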

I think there might be some confusion here? The 100% seems like the true positive rate (correct detection), not the false positive rate?

Nope, 9 of 9 legit sites were incorrectly flagged:

> The tradeoff is that it flagged all 9 of the legitimate sites in our dataset as suspicious

Sorry, I think I had my wires crossed somewhere. Yeah, I see now. That's crazy/hilarious.