Hacker News
by e12e 3448 days ago
I've been thinking about this for a while, and the early conclusion I've come to is that 64 bits of provable random entropy in a password that's also memorable is a very high bar to clear.

Imagine this: you take four word types/groups, say substantive (noun), verb, adverb, preposition/place.

You list 128 of each - every word uniquely identified by its first two letters. You let a machine pick a word from each column at random. The phrase is your mnemonic key, the password (to type in) is the first two letters of each word, concatenated.

If you want to appease password strength checks, capitalise the first letter, and end the input with a period.

So: "girl runs happily up", becomes "giruhaup" (or, with equivalent entropy, but satisfying "at least three symbol groups": "Giruhaup.").
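A minimal sketch of the scheme in Python, to make the steps concrete. The four short word lists are hypothetical stand-ins (the real thing would have 128 words per column), and the names `COLUMNS`, `make_phrase` and `to_password` are invented for this illustration. It assumes `secrets.choice` as the random source, so the picks come from the OS CSPRNG rather than a predictable generator.

```python
import secrets

# Hypothetical, shortened columns. The scheme calls for 128 words each,
# with every word in a column starting with a distinct two-letter prefix.
COLUMNS = [
    ["girl", "boy", "dog", "cat"],                # substantive
    ["runs", "flies", "walks", "jumps"],          # verb
    ["happily", "angrily", "slowly", "quietly"],  # adverb
    ["up", "away", "down", "in"],                 # preposition/place
]

def make_phrase(columns=COLUMNS):
    """Pick one word per column, uniformly at random."""
    return [secrets.choice(column) for column in columns]

def to_password(words, appease_checker=False):
    """Concatenate the first two letters of each word."""
    password = "".join(word[:2] for word in words)
    if appease_checker:
        # Same entropy, but satisfies "at least three symbol groups".
        password = password.capitalize() + "."
    return password

phrase = make_phrase()
print(" ".join(phrase), "->", to_password(phrase))
```

With the example phrase, `to_password(["girl", "runs", "happily", "up"])` gives "giruhaup", and passing `appease_checker=True` gives "Giruhaup.".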

Now, that's then 4 picks out of 128 words, or an encoding of 4 times 7 bits (2^7=128) - 28 bits. You'd need three such passwords concatenated to break past 64 bits of entropy. And you'd have to type in 24 letters. That's pretty hard to type in blind without a typo.
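The arithmetic, as a small Python sketch. The function names are made up for this example, and it assumes every pick is uniform and independent and that the attacker knows the word lists.

```python
import math

def phrase_bits(list_size, picks=4):
    """Entropy in bits of `picks` independent uniform picks from lists of `list_size` words."""
    return picks * math.log2(list_size)

def phrases_needed(target_bits, list_size, picks=4):
    """Smallest number of phrases whose combined entropy reaches `target_bits`."""
    return math.ceil(target_bits / phrase_bits(list_size, picks))

bits = phrase_bits(128)                # 4 * 7 = 28 bits per phrase
count = phrases_needed(64, 128)        # 2 phrases give 56 bits, so 3 are needed
letters = count * 4 * 2                # 4 words per phrase, 2 letters typed per word
print(bits, count, letters)            # 28.0 3 24
```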

You might be able to use lists of 256 words - but it'd make it a bit more difficult to make the wordlists (because words should be identified by the first two characters) - and you'd still need two "phrases" and type in 16 characters.
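To get a feel for the wordlist constraint: with a 26-letter alphabet there are at most 26 x 26 = 676 two-letter prefixes, so 256 words per column is possible in principle, but many of those prefixes start few or no usable words. A rough Python sketch of a checker for a candidate column follows; `prefix_collisions` is a name invented here and the sample words are only illustrative.

```python
from collections import defaultdict

def prefix_collisions(words, prefix_len=2):
    """Return {prefix: [words]} for every prefix shared by more than one word."""
    groups = defaultdict(list)
    for word in words:
        groups[word[:prefix_len].lower()].append(word)
    return {prefix: ws for prefix, ws in groups.items() if len(ws) > 1}

# Illustrative column: "runs" and "rushes" both start with "ru".
verbs = ["runs", "rushes", "flies", "walks"]
print(prefix_collisions(verbs))  # {'ru': ['runs', 'rushes']}
```

A column is usable for the scheme only when this returns an empty dict.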

Adding random numbers, symbols or capitalization is probably not worth the challenge they add in remembering where they go, for the single/few bits of entropy they add.

And I'm still not convinced 16 characters is short enough to be usable for "most people".

2 comments

Rather than rolling your own password system, I would recommend diceware.com for strong passwords (including master passwords) that you can memorize (I am bad at memorization, and have memorized 129 bit passwords this way, and 64 bit passwords are kind of a breeze to memorize).

For the long tail of passwords that you shouldn't be memorizing in the first place, a password manager with a good configurable password generator is invaluable. I use LastPass (I like the breadth of its platform support: all major consumer OSes, all major mobile OSes, extensions for all major browsers). Alternatively, a lot of people recommend 1Password.

Diceware has better guarantees, but the password managers are usually much more convenient[1]. I weigh these costs and benefits when choosing which way to go for a particular use case.

[1] With the significant exception of passwords that will regularly have to be typed out on mobile, since diceware passwords are much more virtual keyboard friendly than random character generated passwords. This is partly because you can typically keep the entire thing in your head, not having to reference your password manager multiple times, and partly because they don't rely on special characters for their entropy, so can be typed out on the primary keyboard without switching to numeral or special character keyboards.

The reason I've been thinking about this is that I'm not happy with diceware. Five words (64 bits of "guaranteed" entropy) is around 20 characters - and I'm not sure if diceware loses some entropy if you omit spaces (eg: "at hat" and "a that" both become "athat").
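The "athat" worry can be checked by brute force on a toy list. This Python sketch uses a four-word list picked deliberately so that it collides; it says nothing about how much (if anything) the real diceware list loses. `joined_entropy` is a name made up for the example.

```python
import math
from collections import Counter
from itertools import product

def joined_entropy(wordlist, n_words):
    """Shannon entropy (bits) of n_words uniform picks joined without spaces."""
    counts = Counter("".join(combo) for combo in product(wordlist, repeat=n_words))
    total = len(wordlist) ** n_words
    return -sum(c / total * math.log2(c / total) for c in counts.values())

toy = ["a", "at", "hat", "that"]           # toy list, chosen to collide
with_spaces = 2 * math.log2(len(toy))      # 4.0 bits for two words
without_spaces = joined_entropy(toy, 2)    # "at"+"hat" == "a"+"that"
print(with_spaces, without_spaces)         # 4.0 3.875
```

Of the 16 possible two-word picks, only 15 distinct strings come out, so a little entropy is lost when the spaces are dropped.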

My main takeaway from looking at the problem is that 64 bits is a lot to encode in ~26 letters and maybe 10 digits in a way that is easy to remember, easy to type, and easy to read (eg: when given a printed initial password) or to read out and hear (sharing over the phone, or doubling as a way to read out a hash/shared key etc).

My main issue with diceware is the large number of words; almost touching on the typical active vocabulary of even native speakers - never mind if your users speak little or no English. One benefit of the system above is that as long as you can come up with four/five sets of 128 words that don't collide among themselves in the groups of 128 - you can adapt the system to any alphabet and preserve any guarantees of entropy. Making a diceware wordlist is a huge undertaking by comparison. (But the benefit is that people have already done this for many languages).

Why truncate the words down to the first two letters? Are you reinventing xkcd 936?
For ease of typing. Typically passwords have to be entered verbatim, without error, and blind. 16 characters is easier to get right than 60.

And while it might feel good to pretend full words add entropy, if you assume the attacker knows your system - it really doesn't (hence "guaranteed" entropy).

As for diceware, I don't find those passwords easy to remember - especially past 60 bits of entropy. But use what works for you.

> And while it might feel good to pretend full words add entropy, if you assume the attacker knows your system - it really doesn't (hence "guaranteed" entropy).

It does: Munroe's proposed scheme operates on the assumption the attacker knows it. The 11 bits of entropy refer to a dictionary of 2K words to choose from. The reason to type full ones is you're not hamstrung by the "no common prefix" limitation, which allows larger (and easier to remember) dictionaries.

Also, we're talking theory. Typing them blindly is an artificial implementation limitation imposed on us by bad software. Just like "you need at least one digit", "maximum length 16", &c. If you're going to consider those, that's fine, but then you're not talking about actual password theory anymore--you're just discussing how to cope with bad platforms.

Case in point: many good PW forms (OS logins, &c) have no such limitations, and offer a "view password while typing" option.

But there's a reason for hiding password input: [ed: making shoulder surfing a little harder]. Or unlocking a computer that's projecting to an audience. [ed: see also Citizenfour, where Snowden uses a blanket when typing in a pass phrase].

This is indeed not about password "theory", because experience shows that actual system (in)security happens where computer systems and users interact.

Using a common subset of keyboard layouts for different languages (limiting the character set), being workable on touch screens, are important for security. And using passwords at all is working around "bad platforms".

> The 11 bits of entropy refer to a dictionary of 2K words to choose from. The reason to type full ones is you're not hamstrung by the "no common prefix" limitation, which allows larger (and easier to remember) dictionaries.

From playing with this, I'm not convinced that the tradeoff of using a big dictionary whose words cannot be identified by a short unique prefix (to reduce length) really adds that much - just like increasing the character set beyond 26/36 doesn't help all that much - because you only gain a bit for every doubling in size.
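To put numbers on "a bit for every doubling", here is a quick Python sketch. `words_needed` is a name invented for the example, 64 bits is the target used throughout this thread, 7776 is the size of the standard diceware list, and picks are assumed uniform and independent.

```python
import math

TARGET_BITS = 64

def words_needed(list_size, target_bits=TARGET_BITS):
    """Number of uniform picks from `list_size` words needed to reach `target_bits`."""
    return math.ceil(target_bits / math.log2(list_size))

for size in (128, 256, 512, 1024, 2048, 4096, 7776):
    print(f"{size:5d} words: {math.log2(size):5.2f} bits/word, "
          f"{words_needed(size)} words for {TARGET_BITS} bits")
```

Going from 128 to 2048 words (16 times the list) only drops the count from 10 words to 6.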

My idea is for the mnemonic to form an actual "story" (in a secure way) - in the hope that it's easier to remember: "boy flies angrily away" than "correct horse battery staple".

A) that may be wrong

B) You still need too many words in order to encode a "high enough" entropy

> The 11 bits of entropy refer to a dictionary of 2K words to choose from. The reason to type full ones is you're not hamstrung by the "no common prefix" limitation, which allows larger (and easier to remember) dictionaries.

Another note on this - assume an average word length of 5 - that's 11/5 or 2.2 bits per character typed (again, assuming the wordlist doesn't lose some bits for "double coding" like "at hat/a that").

At 7 bits per word - of which two characters are enough, we type 7/2 or 3.5 bits per character.

Conversely, we only memorize 7 bits per word vs 11 bits.
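The same tradeoff as a short Python sketch: 11/5 works out to 2.2 bits per character and 7/2 to 3.5. The average word length of 5 is the assumption made above, `cost` is a made-up helper name, and 64 bits is the target used in this thread.

```python
import math

def cost(bits_per_word, chars_per_word, target_bits=64):
    """Return (bits per typed character, words to memorize, characters to type)."""
    words = math.ceil(target_bits / bits_per_word)
    return bits_per_word / chars_per_word, words, words * chars_per_word

# 2048-word list, full words typed (average length 5 is an assumption).
print(cost(11, 5))  # (2.2, 6, 30)
# 128-word lists, only the two-letter prefix typed.
print(cost(7, 2))   # (3.5, 10, 20)
```

So the prefix scheme means less typing (20 characters vs 30) but more words to remember (10 vs 6).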