by icapybara 739 days ago
Gonna have to explain how a “high entropy line” is calculated and why it might be secrets.

Entropy of information is basically how well it can be compressed. Random noise usually doesn't compress much at all and thus has high entropy, whereas written natural language can usually be compressed quite a bit. Since many passwords and tokens will be randomly generated or at least nonsense, looking for high entropy might pick up on them.

This package seems to be measuring entropy by counting the occurrences of each character in each line, and ranking lines with a high proportion of repeated characters as having low entropy. I don't know how closely this corresponds with the precise definition. Source: https://github.com/EwenQuim/entropy/blob/f7543efe130cfbb5f0a...
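For reference, a minimal sketch of per-line Shannon entropy computed from character frequencies. This is not the linked package's code; the function names `shannon_entropy` and `rank_lines` are made up for illustration, and the package may normalize or filter lines differently.

```python
import math
from collections import Counter


def shannon_entropy(line: str) -> float:
    """Return the Shannon entropy of `line` in bits per character."""
    if not line:
        return 0.0
    counts = Counter(line)
    total = len(line)
    # H = -sum(p * log2(p)) over the observed character frequencies.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def rank_lines(text: str, top: int = 3) -> list[tuple[float, str]]:
    """Return the `top` highest-entropy lines, highest first."""
    scored = [(shannon_entropy(line), line) for line in text.splitlines()]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top]


if __name__ == "__main__":
    # Hypothetical sample input; the token is a made-up string, not a real secret.
    sample = "aaaaaaaaaaaaaaaa\nthe quick brown fox\ntoken = 'q8Zx2LmV0pRt7YwK'"
    for score, line in rank_lines(sample):
        print(f"{score:.2f}  {line}")
```

A line made of one repeated character scores 0 bits per character, and a line where every character is distinct scores log2(length), so lines with many repeats rank low, as described above.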

More: https://en.wikipedia.org/wiki/Entropy_(information_theory)

Of course, this heuristic fails for weak passwords.

And it fails for passphrases like 'correct battery horse staple', which have a large enough total entropy to be good passwords, but have a low entropy per character.
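A rough sketch of that total vs. per-character gap, measuring the entropy of the generation process rather than character frequencies. It assumes the words come from a 7776-word Diceware-style list and compares against a token drawn uniformly from the 95 printable ASCII characters; both sizes are assumptions, not something stated above.

```python
import math

DICEWARE_WORDS = 7776   # assumption: standard Diceware list size (6**5)
PRINTABLE_ASCII = 95    # assumption: printable ASCII characters, space included


def total_bits(choices: int, picks: int) -> float:
    """Entropy in bits of `picks` independent uniform draws from `choices` options."""
    return picks * math.log2(choices)


passphrase = "correct battery horse staple"
phrase_bits = total_bits(DICEWARE_WORDS, 4)
phrase_bits_per_char = phrase_bits / len(passphrase)

token_length = 8
token_bits = total_bits(PRINTABLE_ASCII, token_length)
token_bits_per_char = token_bits / token_length

print(f"passphrase:   {phrase_bits:.1f} bits total, {phrase_bits_per_char:.2f} bits/char")
print(f"random token: {token_bits:.1f} bits total, {token_bits_per_char:.2f} bits/char")
```

Under these assumptions the two have similar total entropy, but the passphrase spreads it over 28 characters, which is why a per-line, per-character heuristic tends to miss it.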

Four Diceware words are hardly a good password. That's ~51 bits of entropy, about the same as 8 random printable ASCII characters. It could be trivially cracked in less than an hour. An average line of code, say a variable assigned the result of a method call on an object with a couple of named parameters, has much more entropy.
If you can crack a single 52-bit password in an hour, that suggests you can crack a 40-bit password every second. That's about 1 trillion hashes per second.
350B H/s was achieved in 2012 on consumer hardware. That's over 12 years ago, and several lifetimes of GPU improvements ago. Four Diceware words are simply not appropriate for anything remotely confidential, and it is bad for the community to pretend otherwise.
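The arithmetic in this exchange can be checked with a short sketch. It treats each guess as a single fast hash and uses the 350B H/s figure quoted above as given; the 7776-word list size is an assumption, and the helper names are made up for illustration.

```python
import math


def guesses(bits: float) -> float:
    """Number of candidates in a keyspace of `bits` bits."""
    return 2.0 ** bits


def rate_to_exhaust(bits: float, seconds: float) -> float:
    """Guesses per second needed to try every candidate within `seconds`."""
    return guesses(bits) / seconds


def seconds_to_exhaust(bits: float, rate: float) -> float:
    """Seconds needed to try every candidate at `rate` guesses per second."""
    return guesses(bits) / rate


rate_52_bits_in_hour = rate_to_exhaust(52, 3600)    # roughly 1.25e12 per second
rate_40_bits_in_second = rate_to_exhaust(40, 1)     # roughly 1.10e12 per second

diceware_bits = 4 * math.log2(7776)                 # assumption: 7776-word list
hours_at_350b = seconds_to_exhaust(diceware_bits, 350e9) / 3600

print(f"52 bits in an hour needs {rate_52_bits_in_hour:.2e} guesses/s")
print(f"40 bits in a second needs {rate_40_bits_in_second:.2e} guesses/s")
print(f"4 Diceware words at 350e9 guesses/s: {hours_at_350b:.1f} h to exhaust")
```

These are worst-case times for searching the whole keyspace; on average a match turns up after about half of it.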

https://theworld.com/~reinhold/dicewarefaq.html

If you read the sources, that's 350B _SHA-1_ hashes per second... While you can't be sure what hash system is being used for your passwords, any respectable system using a modern password hash is not even close to being that fast. OWASP's recommended 600,000 rounds of PBKDF2 perform about 1.2 million SHA-2 block rounds, IIRC. If we assume that SHA-1 and SHA-2 are equivalent in performance, then you're looking at only about 290,000 password attempts per second.
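A back-of-the-envelope version of that estimate. Every input is taken from the comments above or assumed: the 350B H/s rate, two hash block computations per PBKDF2 iteration (which is what the 1.2 million figure implies), equal SHA-1 and SHA-2 throughput, and a 7776-word Diceware list.

```python
RAW_HASH_RATE = 350e9          # hashes per second, figure quoted in the thread
PBKDF2_ITERATIONS = 600_000    # OWASP figure quoted in the thread
BLOCKS_PER_ITERATION = 2       # assumption implied by the "1.2 million" figure

SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Each password attempt costs iterations * blocks raw hash computations.
attempts_per_second = RAW_HASH_RATE / (PBKDF2_ITERATIONS * BLOCKS_PER_ITERATION)

# Keyspace of four words from a 7776-word list (assumed list size).
keyspace = 7776 ** 4
years_to_exhaust = keyspace / attempts_per_second / SECONDS_PER_YEAR

print(f"{attempts_per_second:,.0f} password attempts per second")
print(f"about {years_to_exhaust:,.0f} years to exhaust 4 Diceware words")
```

Under these assumptions the same hardware goes from a few hours against a bare fast hash to roughly four centuries for an exhaustive search, which is the point being made about slow password hashes.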

If the password system uses Argon2 with a high memory requirement, you're in an even better position.

Salts and timeouts made that password cracking technique obsolete anyway.
Only for online access. Offline access is still a thing, and in no way "obsolete".
So you do randomly capitalized words, random punctuation, and add a number somewhere, and you’re at 60 bits. Add more for whatever threat model you’re trying to be secure against.

https://beta.xkpasswd.net/

The random punctuation sort-of defeats the point, doesn't it?

Otherwise, I agree.

Not sure; you can use the same character instead of a space and still get a few bits. Of course different ones would be better, but again, depends on how many bits you actually need.
Just imagine my example used 8 words.
But it didn't. It perpetuated the exceedingly common myth that 52 bits is somehow enough. This has been considered bad practice for well over a decade now. https://theworld.com/~reinhold/dicewarefaq.html