
> I'm well aware. How does this help the attacker attacking the higher-entropy string I outlined?

Well, suppose the attacker knows your password generation method (e.g. it's in an open-source password generator, or you wrote the method down and someone stole the description). You have specified the generator as { 5/6 "password", 1/6 "hj5^@l2jl9GGk;Clkm(0]" }, i.e. it outputs "password" with probability 5/6 and "hj5^@l2jl9GGk;Clkm(0]" with probability 1/6. In this case, the attacker will guess the password pretty quickly -- on the second guess at worst -- even in the 1/6 case where it is "hj5^@l2jl9GGk;Clkm(0]".

This is because the string "hj5^@l2jl9GGk;Clkm(0]" doesn't intrinsically have entropy. The generation method is what has entropy -- but in this example, not very much entropy, which is why you got hacked.
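To put numbers on that, here's a rough Python sketch. It's my own illustration, not any real password generator, and the function names are made up. It computes the entropy of that two-outcome generator and the order in which an attacker who knows the generator would try candidates.

```python
import math

# The two-outcome generator from the example: candidate -> probability.
generator = {"password": 5 / 6, "hj5^@l2jl9GGk;Clkm(0]": 1 / 6}

def shannon_entropy_bits(dist):
    """Average surprise of the distribution: -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def min_entropy_bits(dist):
    """Surprise of the single most likely outcome: -log2(max p)."""
    return -math.log2(max(dist.values()))

def guessing_order(dist):
    """An attacker who knows the distribution tries the likeliest candidates first."""
    return sorted(dist, key=dist.get, reverse=True)

shannon = shannon_entropy_bits(generator)
min_ent = min_entropy_bits(generator)
order = guessing_order(generator)

print(f"Shannon entropy: {shannon:.3f} bits")
print(f"Min-entropy:     {min_ent:.3f} bits")
print(f"Guess order:     {order}")
```

Either way you measure it, the generator has well under one bit of entropy (roughly 0.65 bits Shannon, 0.26 bits min-entropy), and the scary-looking string is simply guess number two.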

> How difficult is it for an attacker to attack a password consisting of four lower case english dictionary words?

It depends on the dictionary and the cost to guess a password. If you choose each word uniformly at random from, say, the 3000 most common dictionary words, then there are 3000^4 = 81 trillion possible passwords for the attacker to search. If the application has appropriately used salt and strengthening, such that it takes e.g. 10 core-ms to check a guess (with a function like argon2 that's annoying to run on a GPU), and the attacker throws 1000 cores at the problem, then it will take about 81e12 * 10e-3 / 1000 / 86400 / 365 = 25.7 years to exhaust the entire space, or half that on average.
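Here's the same arithmetic as a small Python sketch, in case you want to plug in your own numbers. The function and parameter names are mine, and the 10 core-ms per guess and 1000 cores are just the assumed figures from above, not measurements of any real system.

```python
SECONDS_PER_YEAR = 86400 * 365

def years_to_exhaust(wordlist_size, num_words, core_seconds_per_guess, cores):
    """Wall-clock years to try every passphrase, assuming words are chosen
    uniformly at random and the work parallelizes perfectly across cores."""
    guesses = wordlist_size ** num_words
    return guesses * core_seconds_per_guess / cores / SECONDS_PER_YEAR

# Assumed figures: 3000-word list, 10 core-ms per guess, 1000 cores.
four_words = years_to_exhaust(3000, 4, 10e-3, 1000)
five_words = years_to_exhaust(3000, 5, 10e-3, 1000)

print(f"4 words: {four_words:.1f} years to exhaust, {four_words / 2:.1f} on average")
print(f"5 words: {five_words:.0f} years to exhaust")
```

Each extra word multiplies the cost by the wordlist size, so under the same assumptions a fifth word takes the attacker from decades to tens of thousands of years.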

Of course, the attacker could use more than 1000 cores, so this difficulty is surmountable, but it is pretty expensive to break. If your account is high-value, then 5 or 6 words would be a better choice. Also, if the service doesn't strengthen the password, and the attacker can acquire the hash, then 4 words is definitely not enough.

> I'm not sure who has dictated that this is supposed to be how entropy is used for password management.

I'm not sure what you mean by "supposed to be used" or "dictated". You don't have to use entropy to analyze password management, but it does make for a good analysis. The theory has been around for decades. See e.g. https://diceware.dmuth.org.

Theorem: if you sample a fresh secret (e.g. a password) from a distribution D with min-entropy x bits, and an attacker then tries to guess it based on no other information (i.e. they might know D, but they didn't, like, already phish the secret), then in N guesses they will succeed with probability at most N/2^x.

Proof: By definition of min-entropy, every possible value of the secret has probability at most 1/2^x, so any one guess is correct with probability at most 1/2^x. By the union bound over the N guesses, the overall probability is at most N/2^x. Easy peasy.
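If you'd rather see the bound than take it on faith, here's a minimal Python check on a toy distribution. The four probabilities are made up for illustration and the function names are mine. The best an attacker can do in N guesses is to try the N likeliest values, so that is what gets compared against N/2^x.

```python
import math

def best_success_probability(probs, n):
    """Optimal attacker: guess the n most likely values, in order."""
    return sum(sorted(probs, reverse=True)[:n])

def min_entropy_bound(probs, n):
    """The theorem's bound N / 2^x, where x is the min-entropy in bits."""
    x = -math.log2(max(probs))
    return min(1.0, n / 2 ** x)

# Toy distribution over four possible secrets (made-up numbers).
toy = [0.4, 0.3, 0.2, 0.1]

for n in range(1, len(toy) + 1):
    actual = best_success_probability(toy, n)
    bound = min_entropy_bound(toy, n)
    print(f"N={n}: best attacker succeeds with {actual:.2f}, bound {bound:.2f}")
```

The bound is tight for the first guess and gets looser after that, since later guesses are less likely than the most likely one.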

Note that this theorem does not hold if min-entropy is replaced by Shannon entropy, which is usually what people mean when they say "entropy" without qualification. Note also that it makes no assumptions about character sets. The character set would only be relevant if each character were chosen i.i.d., or if the attacker decides to attack the password as if this were so.
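To see why Shannon entropy doesn't work here, consider a hypothetical generator (my own example, not one from this thread) that outputs "password" half the time and a uniformly random 100-bit string the other half. A quick Python sketch of its two entropies, computed in closed form since nobody can enumerate 2^100 outcomes:

```python
import math

def mixture_entropies(p_common, tail_bits):
    """Entropies (in bits) of a generator that emits one fixed string with
    probability p_common, and otherwise a uniform choice among 2**tail_bits strings."""
    # log2 of the probability of each individual tail outcome
    log2_tail_each = math.log2(1 - p_common) - tail_bits
    shannon = -p_common * math.log2(p_common) - (1 - p_common) * log2_tail_each
    min_ent = -max(math.log2(p_common), log2_tail_each)
    return shannon, min_ent

shannon, min_ent = mixture_entropies(0.5, 100)
print(f"Shannon entropy: {shannon:.1f} bits")
print(f"Min-entropy:     {min_ent:.1f} bits")
```

That's 51 bits of Shannon entropy but only 1 bit of min-entropy, and the attacker's very first guess succeeds half the time. A "1/2^51 per guess" bound based on Shannon entropy would be wildly wrong, while the min-entropy bound of 1/2 is exactly right.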