|
|
|
|
|
by blake8086
5179 days ago
|
|
This seems like a great step forward, but it's still a bunch of ad-hoc rules. While the ruleset is definitely well-put-together and fairly comprehensive, it still doesn't seem like the most accurate measure. It seems like password strength basically boils down to: 1) imagine the space of all possible passwords 2) put them in order from most to least likely (123456 would be at the top, some giant 64 character random monster at the bottom) 3a) if you're malicious, use this list to begin cracking 3b) if you're securing something, use this list to measure strength An ideal password strength measurer would simply return the approximate rank of your password. |
|
The most obvious way to do password strength checking would not (I don't think) let you "use this list to begin cracking", but would instead estimate the Kolmogorov complexity of the password, as a proxy for its entropy.
That sounds daunting, but it's actually pretty simple in principle: append the password to a couple concatenated dictionaries plus popular password files, and see how much it compresses with your favorite zipping algorithm. Compare it to how much 'password1' zips, because you know that's the first one that they try and therefore it has complexity 2^0. If the zipping algorithm is good, it will automatically figure out most of these tricks directly from the 'bad password lists'.
I would say a little more: it is probably the case that you can "steal" the dictionary from one zipping and force a zipping algorithm to use that dictionary. If this is the case, the dictionary needed reduces to 64 KiB (I believe) rather than the 600 KB that the above script requires. I don't know how much effort it takes to get zlib-with-a-static-preset running in JS but then again, I don't know how much time it took zxcvbn to reach its final form either.
Using /usr/share/dict/american-english for my dictionary is a bit crap because it does not yet speak l33t, but my dictionary can be used for "correcthorsebatterystaple". XKCD estimates 44 bits = 5.5 bytes; gzip --best estimates 7 bytes, maybe more if we had more sequences ending in '1' to better compress 'password1'. (Some extra bits are to be expected purely due to the diverse number of password-guessing algorithms; 'switch one character to l33t, switch two characters to l33t, end with a number' offer a couple extra bits which XKCD ignores in order to establish a lower bound.)