Hacker News
by salmonellaeater 4675 days ago
> "This is an answer to the batteryhorsestaple thing."

Steube misunderstands the xkcd comic [1]. There's a really good comment which explains it: "It could be argued that Randall's example of 4 words is too short -- and indeed, for some applications, it is. However for a typical dictionary size, and genuinely random selection, it is massively stronger than "typical" passwords and in fact easily adequate to defeat the above-mentioned attacks." [2]

Emphasis on "genuinely random selection."

[1] https://xkcd.com/936/

[2] http://www.schneier.com/blog/archives/2013/06/a_really_good_...

2 comments

What makes you think he misunderstands it? For the cracker it's not about entropy per se, it's a game to come up with algorithms that crack more passwords for less compute power. The XKCD comic got a lot of mindshare so it makes sense to target algorithms towards that type of password.

I think Schneier's suggestion of reducing it to the first letter of each word is vastly preferable because it packs the majority of entropy from random word selection into the least amount of typing.

The algorithm is not targeted against the type of password which the XKCD comic suggests. The algorithm is designed to exploit common human behavior, which is similar to the XKCD method but not identical. The significant difference is that human behavior in picking words is not random, while the XKCD method requires the word selection process to be truly random. The "iloveyousomuch" example by Steube is unlikely to be picked randomly.

salmonellaeater is right: Steube misunderstands the comic. The idea of the comic is to pick a small random selection from the 250,000 distinct words in an Oxford dictionary, rather than 8 of the 95 printable ASCII characters. A selection of 3 words then has higher entropy than 8 random characters, because 250,000^3 is a bigger number than 95^8. The question then is, will 3 random words really be easier to remember than 8 ASCII printable characters?
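
To put numbers on that comparison, here is a quick Python sketch. The 250,000-word dictionary and the 95 printable characters are the figures from the comment above; the function name is just for illustration, and the calculation assumes every pick is independent and uniformly random.

```python
import math

def entropy_bits(pool_size: int, picks: int) -> float:
    """Bits of entropy for `picks` independent, uniform choices from a pool."""
    return picks * math.log2(pool_size)

words_3 = entropy_bits(250_000, 3)  # three random dictionary words
chars_8 = entropy_bits(95, 8)       # eight random printable ASCII characters

print(f"3 words from 250,000: {words_3:.1f} bits")
print(f"8 chars from 95:      {chars_8:.1f} bits")
```

Under those assumptions the three words come out at about 53.8 bits against about 52.6 bits for the eight characters, so the margin is real but only a bit over one bit.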

The downside to the Schneier scheme is that each password starts from a common sentence (low entropy), with a chosen transformation algorithm applied. Thus the quality of the password will depend on the number of transformation algorithms, and the quality of each one. If we use the one first described to create "tlpWENT2m", we get a password strength like this:

Using strictly the first letter of each word gives only a 2x linear increase in the number of candidates over just searching for common sentences. Replacing words with common number substitutes adds another factor of up to 2x. Writing one of the words in all caps adds a factor of 6x. Combined, tlpWENT2m is slightly less secure than "This little piggy went to market" followed by two random digits or a single random letter.

Where are you guys getting this? All I read was this:

> Steube was able to crack "momof3g8kids" because he had "momof3g" in his 111 million dict and "8kids" in a smaller dict.

> "The combinator attack got it! It's cool," he said. Then referring to the oft-cited xkcd comic, he added: "This is an answer to the batteryhorsestaple thing."

It sounds to me like he's combining words randomly, not "exploiting common human behavior".

He found a password by combining 2 words randomly from two dictionaries of different sizes, so he only had m * n combinations to choose from, and his n is a lot smaller than m.

Whereas the xkcd approach is more like m * m * m * m.

In other words, exponentiation > multiplication.
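
A rough sketch of that comparison in Python. Only the 111 million figure comes from the quoted article; the size of the smaller dictionary is not given, so the value of `n` below is a placeholder assumption, and the word-list size `w` is borrowed from the 250,000-word dictionary mentioned elsewhere in this thread.

```python
import math

def bits(keyspace: int) -> float:
    """Size of a keyspace expressed in bits."""
    return math.log2(keyspace)

m = 111_000_000  # large dictionary size quoted in the article
n = 1_000_000    # ASSUMPTION: the smaller dictionary's size is not stated
w = 250_000      # ASSUMPTION: word-list size for the random-words approach

combinator = m * n   # one entry from each dictionary
four_words = w ** 4  # four independent random words from one list

print(f"combinator (m * n): {combinator:.3e} candidates, {bits(combinator):.1f} bits")
print(f"four words (w ** 4): {four_words:.3e} candidates, {bits(four_words):.1f} bits")
```

Plugging in other sizes shows how sensitive the comparison is to the word list: the exponent helps, but the size of the list each word is drawn from matters just as much.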

Correct. What I meant by "exploiting common human behavior" is that the dictionaries the attacker used are built from lists of old passwords found in previous attacks. Those dictionaries will be orders of magnitude smaller than a dictionary of the English language, but attackers know that people tend to pick passwords (or in this case, combinations of passwords) that someone else has already thought of before. It's a simple observed behavior: people in general tend to think alike, and do not think randomly even if, individually, it "feels" random.
> The question then is, will 3 random words really be easier to remember than 8 ASCII printable characters?

In a sense, yes. The xkcd comic also illustrates this. A common technique to remember a sequence of arbitrary things is to transform the things into concepts or objects, and transform this sequence into a ridiculous story or visual image (the crazier it is, the better it sticks in the mind, plus it's more fun).

If you use words instead of random characters, you get to skip the "transform into concepts or objects" step, and you don't need to string as many of them together into a crazy but coherent picture/story.

Of course it's important to build the picture after the words, not the other way around, because then you'd probably lose some entropy again.

The entropy Randall calculated for "correcthorsestaplebattery" was a lower bound, meaning that if the attacker knows that you made your password out of 4 dictionary words, it still has tons of entropy. If the attacker doesn't know how you came up with your password, it'll take them even longer.
How would something attack Diceware?

There's a list of 7776 words, and everyone knows what words are on the list. I suspect that sometimes people re-roll because they don't like a word or don't think they'll remember it. But I don't think that makes much difference.

> I suspect that sometimes people re-roll because they don't like a word or don't think they'll remember it.

This is strongly discouraged, and it does matter. The words are supposed to be random, but re-rolling makes them not random.

What password length would you need to get away with a plain-old grammatical English sentence (i.e. very much non-random selection)?

For example: "and in the swept plains of winter's vale, our hero did beseech the emperor to send for his forces" -- what would be the difficulty in cracking that, given that this isn't a quote from a book or anything, but just a sentence that popped into my mind and seems easy enough to remember?

Almost 20 years ago I saw a great password-picking article that still holds today. http://world.std.com/~reinhold/diceware.html

Take a list of 6^5 words. Roll 5 dice. Take that word from the list. Do this 4 more times. You now have a five-word passphrase like "moire fraud 80 row bernet".

Even if someone knew the exact method and list you did to get that passphrase, there are 28430288029929701376 combinations, giving you over 64 bits of entropy.
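
A minimal sketch of that procedure in Python, assuming the `secrets` module stands in for physical dice. The placeholder word list is made up so the example runs on its own; a real passphrase would use the actual Diceware list from the page linked above.

```python
import math
import secrets

DICE_PER_WORD = 5
LIST_SIZE = 6 ** DICE_PER_WORD  # 7776 entries

def roll_index() -> int:
    """Simulate five dice rolls and map them to an index in 0..7775."""
    index = 0
    for _ in range(DICE_PER_WORD):
        index = index * 6 + secrets.randbelow(6)  # one die, as 0..5
    return index

def passphrase(wordlist: list[str], words: int = 5) -> str:
    """Pick `words` entries, each chosen by its own set of five rolls."""
    if len(wordlist) != LIST_SIZE:
        raise ValueError(f"expected {LIST_SIZE} words, got {len(wordlist)}")
    return " ".join(wordlist[roll_index()] for _ in range(words))

# Placeholder list (hypothetical); substitute the real Diceware list.
placeholder = [f"w{i:04d}" for i in range(LIST_SIZE)]
print(passphrase(placeholder))

combinations = LIST_SIZE ** 5
print(combinations)                           # 28430288029929701376
print(f"{math.log2(combinations):.1f} bits")  # 64.6 bits
```

Every roll is kept, with no re-rolling of words you dislike, so each word contributes its full log2(7776) bits.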

Someone has probably tried to rainbow table all those results for MD5. If a core can do 1 billion hashes per second, it would take 900 core-years to build a complete list of all those combinations, which is probably feasible for a small group to put together, but messing with the list just a little bit or adding a 6th word would likely put you past that even for a crappy MD5 hashing.
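
The core-years figure can be checked with a few lines of Python. The rate of 1 billion hashes per second per core is the comment's own assumption; the helper name is hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
HASHES_PER_SECOND = 1_000_000_000  # the comment's assumed rate for one core

def core_years(keyspace: int, rate: float = HASHES_PER_SECOND) -> float:
    """Time for one core to hash every candidate in the keyspace, in years."""
    return keyspace / rate / SECONDS_PER_YEAR

five_words = core_years(7776 ** 5)
six_words = core_years(7776 ** 6)

print(f"5 words: {five_words:,.0f} core-years")
print(f"6 words: {six_words:,.0f} core-years")
```

Five words works out to roughly 900 core-years, and each added word multiplies that by 7776, which puts six words at around 7 million core-years at the same rate.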

Shannon did an experiment that found the entropy of English text is about 1.6 bits per character. This is probably a high estimate, since the kinds of sentences you might think up for a password probably have lower entropy than if you used a source of random bits to generate valid sentences.
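
As a back-of-the-envelope sketch, the 1.6 bits per character figure can be turned into a sentence length. This takes the figure at face value, which the comment itself says is probably too generous for made-up sentences, so read the results as an optimistic bound. The function names are hypothetical.

```python
import math

BITS_PER_CHAR = 1.6  # the per-character figure cited in the comment above

def sentence_bits(sentence: str, bits_per_char: float = BITS_PER_CHAR) -> float:
    """Estimated entropy of an English sentence at a flat per-character rate."""
    return len(sentence) * bits_per_char

def chars_needed(target_bits: float, bits_per_char: float = BITS_PER_CHAR) -> int:
    """Sentence length needed to reach a target number of bits."""
    return math.ceil(target_bits / bits_per_char)

diceware_bits = math.log2(7776 ** 5)  # five-word Diceware passphrase
print(chars_needed(diceware_bits))    # 41

example = ("and in the swept plains of winter's vale, our hero did "
           "beseech the emperor to send for his forces")
print(f"{sentence_bits(example):.0f} bits (optimistic estimate)")
```

At that rate a sentence needs about 41 characters to match a five-word Diceware passphrase, and more if the real entropy per character is lower.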
My God, are you going to type all of that, or will you need a script to do it for you? Watch out for those touch-screen thingies people are touting around.
http://keepass.info/

Some things you don't always need to use from those touch-screen thingies

That's a funny choice for the name. Is it kee-pass or keep-* ?
With Swype and similar programs, passphrases are pretty easy to enter.
I know there are tools & password vaults, but what percentage of people use them? Secondly, those password managers introduce another possible vulnerability where you don't have control.
Swype is a text-entry interface.