by x1798DE
4394 days ago
Well, he's saying that he has a sample of 40k hackers' passwords stored somewhere, and that between them there are 2000 unique strings, ~1200 of which were in plain text and didn't need to be cracked at all. So if this sample of 40k hacker passwords is a random sampling, then he essentially has a random, unbiased sample of 1200 unique passwords, plus a biased set of 300 more. He's not super clear about where the 40k passwords came from, so they may be a random sample, but it's quite possible that it's just a sampling of bad hackers. He mentions that he has gathered many examples of bots and shells and such, so you can imagine that he's looking at a sampling of (1) hackers whose bots store their passwords in such a way that he can reverse-engineer where they are stored and (2) hackers who store their passwords in plain text.

That said, if he has 40,000 passwords that boil down to 2000 unique strings, of which only ~400-500 are either good passwords stored in plain text or not easily crackable, then that means about 35,000 out of the 40,000 passwords he captured were easily guessable (I'm assuming here that there were no duplicates in the "good" password set), which is about 87.5% of his sample.
Yes, that's basically my point. The set of hackers who use strong passwords and the set of hackers who don't protect those passwords well in their bots/viruses/whatever probably don't have a lot of overlap.
Also, it sounds like he couldn't crack (and thus couldn't include in the sample) some of the hashed passwords. Passwords that he can't crack or brute-force in a reasonable time are probably strong passwords. Not having those passwords biases the sample. It's like giving a standardized test when all the honors classes are on a field trip: by removing the top end, you bias the sample downward and make the overall population look worse.
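
To make the field-trip analogy concrete, here's a rough sketch in Python. Everything in it is made up for illustration: the "strength" scores, their distribution, and the `crack_limit` cutoff are assumptions, not anything from his data. It only shows that throwing away the uncrackable top end pulls the observed average down.

```python
import random
import statistics

def censored_mean_demo(n=10_000, crack_limit=60.0, seed=0):
    """Compare the mean strength of all passwords vs. only the crackable ones."""
    rng = random.Random(seed)
    # Hypothetical strength scores (think "bits of entropy"); the
    # normal distribution and its parameters are invented for this sketch.
    strengths = [rng.gauss(40.0, 15.0) for _ in range(n)]
    # Assumption: anything stronger than crack_limit can't be cracked,
    # so it never makes it into the observed sample.
    observed = [s for s in strengths if s <= crack_limit]
    return statistics.mean(strengths), statistics.mean(observed), len(observed)

true_mean, observed_mean, kept = censored_mean_demo()
print(f"mean strength, everyone:       {true_mean:.1f}")
print(f"mean strength, crackable only: {observed_mean:.1f}")
print(f"passwords kept in sample:      {kept} of 10000")
```

The crackable-only mean always comes out below the full mean here, because the only values dropped are ones above the cutoff. How big the gap is depends entirely on the made-up numbers, so don't read anything into the magnitude.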