by one_off_comment 1608 days ago
Tangential thought: if you're using a passphrase you're not going to ever type manually, for example something you're going to generate once and stick in a secret management system, why not build the passphrase using all possible UTF-8 characters as your corpus? Seems like restricting yourself to ASCII characters is just giving an advantage to those attempting to brute force the passphrase.
4 comments

> why not build the passphrase using all possible UTF-8 characters as your corpus? Seems like restricting yourself to ASCII characters is just giving an advantage to those attempting to brute force the passphrase.

Restricting yourself to ASCII means you don't need to worry about text encoding. Who knows when you'll end up needing to paste it somewhere, or when something decides to be helpful and mangles the encoding.
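
To make the encoding worry concrete, here is a minimal sketch of one way a "helpful" tool can change a non-ASCII secret: Unicode normalization. The example character is my own choice for illustration, not something from the thread.

```python
import unicodedata

composed = "\u00e9"      # "é" as a single code point
decomposed = "e\u0301"   # "e" followed by a combining acute accent

# Both render the same to a human, but they encode to different bytes,
# so a hash or KDF fed these bytes would see two different secrets.
composed_bytes = composed.encode("utf-8")
decomposed_bytes = decomposed.encode("utf-8")
bytes_differ = composed_bytes != decomposed_bytes

# Normalizing to a single form (NFC here) makes the two agree again.
nfc_equal = unicodedata.normalize("NFC", decomposed) == composed
```

If anything between generation and use normalizes the text differently, the stored secret no longer matches. An ASCII-only secret is unaffected, since ASCII characters come through NFC and NFD unchanged.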

This doesn't make much sense to me. The point of a passphrase is to be readable/writeable by a human. If you don't need that, you just want a binary key (which can be base64 encoded/decoded to be read/written by a human).

Using all UTF-8 characters seems like it combines the downsides of both of these (not really human readable/writeable, but also not using the full key space).
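
A rough sketch of the binary-key approach, assuming Python's standard `secrets` and `base64` modules. The function names are made up for illustration.

```python
import base64
import secrets


def generate_key_b64(num_bytes: int = 32) -> str:
    """Return num_bytes of CSPRNG output, base64-encoded as ASCII text."""
    raw = secrets.token_bytes(num_bytes)
    return base64.b64encode(raw).decode("ascii")


def decode_key_b64(encoded: str) -> bytes:
    """Recover the raw key bytes; reject characters outside the base64 alphabet."""
    return base64.b64decode(encoded, validate=True)


key_text = generate_key_b64()         # what you paste into the secret manager
key_bytes = decode_key_b64(key_text)  # what the crypto code actually consumes
```

Every byte of the key carries a full 8 bits of entropy, and the stored form is still plain ASCII.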

I'd be happy with just 32 bytes of random alphanumeric ASCII; it doesn't really need improvement. If that gives too big an advantage, then use more.
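
For reference, a minimal sketch of generating such a string with Python's `secrets` module. `random_alphanumeric` is a hypothetical helper name, and the 62-character alphabet is my reading of "alphanumeric".

```python
import secrets
import string

# 26 lowercase + 26 uppercase + 10 digits = 62 symbols
ALPHABET = string.ascii_letters + string.digits


def random_alphanumeric(length: int = 32) -> str:
    """Pick each character independently and uniformly using the OS CSPRNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


passphrase = random_alphanumeric()
```

"Use more" is then just a larger `length` argument.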

Because all possible UTF-8 characters give you passphrases that are hard to write down, hard to transcribe from paper onto keyboard or vice versa, hard to repeat aloud, and hard to recognize visually.

For a given level of target entropy, the tradeoff is longer passphrases (with a smaller character set) versus "less humane" passphrases (with a larger one).
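
The tradeoff can be put in numbers with a small sketch, assuming each character is drawn uniformly and independently. The 128-bit target and the 100,000-symbol "large Unicode subset" are illustrative assumptions, not figures from the thread.

```python
import math


def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy of `length` characters drawn uniformly from the alphabet."""
    return length * math.log2(alphabet_size)


def length_for_bits(alphabet_size: int, target_bits: float) -> int:
    """Smallest length whose entropy reaches target_bits."""
    return math.ceil(target_bits / math.log2(alphabet_size))


TARGET_BITS = 128  # assumed target, chosen for illustration

alphabets = {
    "digits": 10,
    "lowercase": 26,
    "alphanumeric": 62,
    "printable ASCII": 95,
    "large Unicode subset (hypothetical)": 100_000,
}

for name, size in alphabets.items():
    print(f"{name}: {length_for_bits(size, TARGET_BITS)} characters")
```

Under these assumptions, going from alphanumeric ASCII to a 100,000-symbol set shortens a 128-bit passphrase from 22 characters to 8. The same gain is available by adding 14 ASCII characters instead.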