Hacker News
by dllthomas 4735 days ago
Do not give password advice without looking at the entropy.

Estimates of the entropy of English text place it below 1.5 bits per character. "The brown cow jumps over the moon." would, generously, have about 34 * 1.5 = 51 bits of entropy, plus a few more for the simplistic substitutions - say 70 bits total? This is assuming the sentence was, in fact, chosen uniformly across English sentences, which is obviously not going to be the case (this one being a modification of a line from a nursery rhyme), so in actuality it'll be even worse.

A fully random password of length 20, from characters on a typical keyboard (say 94, which is what mine seems to have), would have 20 * lg(94) > 20 * 6.5 = 130 bits. But it would be impossible to remember and a pain to type correctly.
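The arithmetic so far is easy to check. Here is a minimal Python sketch, assuming the 1.5 bits-per-character figure for English and the 94-character keyboard mentioned above; the function names are made up for illustration.

```python
import math

def english_sentence_bits(sentence: str, bits_per_char: float = 1.5) -> float:
    """Generous estimate: every character contributes bits_per_char bits."""
    return len(sentence) * bits_per_char

def random_password_bits(length: int, alphabet_size: int) -> float:
    """Entropy of a password whose characters are drawn uniformly and independently."""
    return length * math.log2(alphabet_size)

sentence = "The brown cow jumps over the moon."
print(len(sentence), english_sentence_bits(sentence))  # 34 51.0
print(round(random_password_bits(20, 94), 1))          # 131.1
```

The sentence figure is only an upper bound under that assumption; it says nothing about the extra bits from substitutions, which the 70-bit guess above rounds in generously.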

Picking from my /usr/share/dict/words with no restrictions (99171 entries, about 16.6 bits per word), 70 / lg(99171) is about 4.2, so it would take 5 words to be stronger than the sentence, and 130 / lg(99171) is about 7.8, so it would take 8 words to be stronger than the gibberish, with no substitutions or tweaks. However, not all of those passwords could be typed on my keyboard.

Restricting /usr/share/dict/words to those which match (with LANG=C) '^[a-zA-Z]\{1,10\}$' yields 61078 words at about 15.9 bits of entropy per word, so you would get security comparable to the above with 5 (again - aliasing) and 9 words respectively.
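The word counts come from dividing the target entropy by the bits per word and rounding up. A small Python sketch of that step, assuming every word is picked uniformly and independently (the helper name is hypothetical):

```python
import math

def words_needed(target_bits: float, wordlist_size: int) -> int:
    """Smallest number of uniformly chosen words with at least target_bits of entropy."""
    bits_per_word = math.log2(wordlist_size)
    return math.ceil(target_bits / bits_per_word)

# Word list sizes quoted above: the full dictionary and the filtered one.
for size in (99171, 61078):
    print(size, round(math.log2(size), 1), words_needed(70, size), words_needed(130, size))
# 99171 16.6 5 8
# 61078 15.9 5 9
```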

Some nine-word passwords generated this way:

    embryo distressed Ramadan chocks broaching official outstript explicit formulas
    tokens bruskly realizing rubric earmarks aphorism sweeps hallelujah Bardeen
    respects jocularity crummier leave spinsters Rodriquez hatch assurance torture
    patinas Elba dairymaids blabbing kissing handyman Ind tobogganed directed
    mossy Flora concepts medalist kidding heinously deafened evaluation nodes
    Steinmetz lizard Janette scatted cunning geckos belched demurring grandest
    faints nicest unleashes navel Monroe frostbites Pl loon careening
    overtake tasselled quahog utters Upjohn incloses punchy Jericho reveille
    sicked sinning premiere Satanism loiters accrual Caspar infatuate renewable
    dinning hereabouts Lithuanian formalism voiceless demoted bundle teed fluent
The above were generated with:

    LANG=C grep "^[a-zA-Z]\{1,10\}$" /usr/share/dict/words | rl --reselect -c 10 | xargs

This obviously relies on the assumption that rl produces cryptographic-quality randomness, which is probably not the case but should certainly be near enough for examples (and in any case will be much, much closer to true than any method involving humans - we are very poor sources of cryptographic entropy).
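If the quality of the random source is a worry, one option is to swap in a generator that is documented as cryptographically strong. The sketch below is not the pipeline used above; it is a hedged Python alternative built on the standard `secrets` module, with the same 1-10 ASCII letter filter. The function names and the tiny sample list are assumptions for illustration only.

```python
import re
import secrets

WORD_RE = re.compile(r"[a-zA-Z]{1,10}")  # same filter as the grep pattern

def load_words(lines):
    """Keep only words of 1-10 ASCII letters, dropping duplicates but keeping order."""
    unique = dict.fromkeys(line.strip() for line in lines)
    return [w for w in unique if WORD_RE.fullmatch(w)]

def passphrase(words, count=9):
    """Pick `count` words uniformly, with replacement, using the OS CSPRNG."""
    return " ".join(secrets.choice(words) for _ in range(count))

# Tiny stand-in list so the sketch runs anywhere; a real run would pass
# open("/usr/share/dict/words") to load_words instead.
sample = ["embryo", "rubric", "quahog", "lizard", "naïve", "it's",
          "mossy", "internationalization", "rubric"]
words = load_words(sample)
print(words)              # ['embryo', 'rubric', 'quahog', 'lizard', 'mossy']
print(passphrase(words))  # nine words, different on every run
```

Choosing with replacement keeps the entropy estimate simple: each word contributes a full lg(N) bits for a list of N words.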

2 comments

There is also the excellent Diceware: http://world.std.com/~reinhold/diceware.html
I'm a fan of Diceware. Strong entropy guarantees and memorable passwords.
Thanks for taking 1 tiny part of my point and trying to destroy it.

I chose the passphrase "The Brown cow jumps over the m00n!" as an example, not "The brown cow jumps over the moon.", which is a significantly worse passphrase, especially considering every word is available in a dictionary.

The OP had trouble memorizing more than 16 characters for a passphrase, so I suggested something easier yet still solid, yet you seemed to think I suggested just a plain English sentence.

If you think "m00n" vs. "moon" or "The" vs. "the" matters, you're not paying attention. A memorizable, but randomly composed string of words all in lower-case ASCII is significantly stronger than anything "complex" (for you, hardly for the cracker--common substitutions are basically worthless: they provide no entropy) you can concoct and remember.
Again, you're not looking at the entropy.

Anything you can derive from a plain English sentence without adding much unpredictability is not a significantly better passphrase than the plain English sentence itself. Better? Yes. And I credited you for that.
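A rough way to put a number on "better, but not significantly": count how many variants a cracker would have to try per base sentence. The model below is an assumption made up for illustration, not anything measured: each word may or may not have its first letter capitalized, each word containing an "o" may or may not have it swapped for "0", and the sentence ends with one of four punctuation choices.

```python
import math

def tweak_bits(sentence: str, punctuation_options: int = 4) -> float:
    """Bits added by optional tweaks under the simple, assumed attacker model."""
    variants = 1
    for word in sentence.split():
        variants *= 2                  # first letter capitalized or not
        if "o" in word.lower():
            variants *= 2              # every 'o' replaced by '0', or not
    variants *= punctuation_options    # e.g. nothing, '.', '!' or '?'
    return math.log2(variants)

print(tweak_bits("the brown cow jumps over the moon"))  # 13.0
print(round(math.log2(61078), 1))                       # 15.9
```

Under those assumptions, every tweak combined adds about 13 bits, which is less than one extra word drawn at random from the 61078-word list.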