Hacker News new | ask | show | jobs
by mlyle 1587 days ago
Is it?

6 guesses and I have 14 hex digits (56 bits) of the hash, along with knowing the population counts for all the numbers. This is enough to run a password cracker and determine the plaintext if it's a readily guessed password.

Sure, it breaks conventional use of rainbow tables, etc, but...

edit: Eh, 14 characters. OK, that's pretty resistant to anything other than debugging.

2 comments

How does that help you when any of your inputs' digest is not related to any other's, not even knowing the target length of the original message? what am i missing?
The correct password is impossible to calculate from the given data, but it seems like it should be possible to check whether a password matches the data.
Yeah because the algo is known, it is SHA256.

The thing is you don't know the length of the password. It could be more than the number of hydrogen atoms in the universe, or 12. You still have to brute force or look up one possible solution (or collision thereof).

The whole thing just shows that a hash makes ZERO applicable inferable assertions about the message (password).

Thats the definition of evenly distributed hashing functions: change anything in the message, including length, and there will be no identifiable relation between the hashes of one messsage and the next you try,

I think for something this checking the source for the generation algorithm is fair game. here it is:

  function randomInt(n) {
    return Math.floor(Math.random() * n);
  }

  function randomPassword() {
    let letters = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
    let digits = '0123456789';
    let punctuation = '!"#$%&\'()\*+,-./:;<=>?@[\\]^_`{|}~';
    let s = letters.repeat(7) + digits.repeat(4) + punctuation.repeat(3);
    let length = 14;
    let res = Array.from({length}, (() => 
      s[randomInt(s.length)])).join('');
    return res;
  }
looks like it's 14 characters long, and each character has an independent 72.8% / 8% / 19.2% chance of being a random letter / digit / punctuation. There are 94 symbols total, so 94^14 possible solutions; roughly 92 bits of entropy. Even if you assume 10 letters, 1 digit, 3 punctuations (the "likely" distribution) it's still 75 bits of entropy. You might be able to gain an advantage through knowledge of the PRNG state, but the PRNG in v8 (xorshift128+) has a period of 2^128 - 1.

So not great odds...

92 bits of entropy, and the first guess peels off about 14 bits of it. Subsequent guesses a little less.

The annoying thing is, you still have to search that whole space to find the password.

But after 9 guesses, you can solve offline for the character string... it's just very expensive.

How does the first guess "peel off" 14 bits of entropy?
Could you do it with a rainbow table?
> The thing is you don't know the length of the password. It could be more than the number of hydrogen atoms in the universe, or 12.

I'll take 12 then.

Well we know it has to fit in a string data type. And there’s only soooo much ram available to a JavaScript variable.
Not necessarily. The hash could have been generated with something other than javascript.

In fact because functions like sha256 are iterative it's possible to hash a password which is longer than the RAM in a system. Technically possible to hash a password which is longer than storage in a system too, if you don't care about storing the password.

I mean, anything past 256 bits is going to have a collision, so that doesn't matter, but you're right that the entire point of a hash is that even if you know the hash, it's very very hard to find what the plaintext is.
There are a number of reversible hash algos. The point of hash is that the small changes in the input produce big changes in the output so even a 1-bit change to the input produces a completely different output. Some hash algos having trap door functionality is really more of a bonus.
You can only reverse your hash function, if you output is at least as long as your input.

The kind of function you describe is useful, too, of course. You can build something like them out of almost any modern encryption method:

Encryption methods have to be reversible, so you can decrypt; and they are expected not to betray anything about their inputs, so there are probably some that have this avalanche property, or can be patched to have it fairly simply.

Sure, I guess I wasn't being specific enough. Once of the reasons to use SHA-256 is because that's hard to do.
It's true that any input length larger than 256 bits will exhibit a collision. It isn't true that it will necessarily exhibit every possible output. Maybe there's an output value that is only available for ridiculously large input.
Yes, that's possible in general. Though fairly unlikely, if the hash was 'random'.

We know the structure of SHA256, so we could actually answer that question.

https://en.wikipedia.org/wiki/Preimage_attack says that pre-image attacks on hash function in general only take 2^n time (ie you don't need to look for passwords longer than 256 bits), but I don't see how they conclude that.

> It could be more than the number of hydrogen atoms in the universe

Not very likely, since the OP wouldn’t be able to hash it. Or he’s secretly demonstrating something much more awesome than Passwordle.

Who said they hashed it? The correct answer is a 256-bit value, and you're trying to guess any string that hashes to that value. Nothing requires that OP generated that value by hashing a string though...
Fair point, I’m just hoping that the author starts from something that could actually be input as an answer (and therefore hashes it at least once).
> Not very likely, since the OP wouldn’t be able to hash it.

Not necessarily. OP might have found the answer with a mathematical short-cut.

To give a really silly example: suppose my hash function just returns the length of the input string. (That's what PHP used to do for hashing at some point.)

I could tell you what my hash of a really big number is, without needing to be able to write that number down. And no shorter number would have the same hash.

SHA256 might have a similar exploit. (Though as you say finding such a shortcut in SHA256 would be much more awesome than Passwordle.)

As a proponent for advancement, I will hope for the latter while laughing at your comment.
Why not? You don’t need enough resources, just lazy seq and time!
> It could be more than the number of hydrogen atoms in the universe, or 12.

Doesn't matter. You don't really have to look at passwords longer than 256 bits, because above that you'll have guaranteed collisions.

(The exact math is a bit more complicated, because there might be so many collisions in the first 256 bits, that there are strings longer than 256 bits that produce hashes that haven't been hit before.

But the order of magnitude of 256 bits is about right.)

"The thing is you don't know the length of the password. It could be more than the number of hydrogen atoms in the universe, or 12."

You don't have to know the length though, just the length of potential collisions(so between 1 and whatever max length is the hash)

i don't think anyone has created a hash collision in SHA-256 yet (meaning, given a hash, create an input that generates the hash)
yeah but who cares? it's still 100% possible, you just need a lot of time
The length of the password is 14.

  function randomPassword() {
    let letters = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
    let digits = '0123456789';
    let punctuation = '!"#$%&\'()\*+,-./:;<=>?@[\\]^_`{|}~';
    let s = letters.repeat(7) + digits.repeat(4) + punctuation.repeat(3);
    let length = 14;
    let res = Array.from({length}, (() => s[randomInt(s.length)])).join('');
    debugger; // どうぞ
    return res;
  }
[ @some-sober-math-guy insert probabilities ]
This isn’t how Sha works
If they were dictionary words, or a similarly constrained search space, fed through SHA, this is exactly how it works. The information given after one guess excludes a whole lot of guesses as being possible solutions.

Here, there's 91.7 bits of entropy in what goes into the hash function. Each guess shaves off more than 10 bits of entropy. After 9 guesses, only one password conforming to the generation format will be possible... yes, it will be very (impractically) hard to find this password, but the rest could be done offline to find the 10th value and solve it in 10 guesses.

e.g. Make 9 random guesses.

Then, for each of the 2^92 possible input strings:

1. Hash it.

2. See if the hash matches the things we know about the hash from the previous guesses.

Right, it's a random password tho not a dictionary word
Even so, I just edited my comment and elaborated.

You can do this in 9 online guesses with feedback + a very large number of offline guesses, and have the solution for the 10th.

The information is there-- just the best search strategies known are very expensive.

> a very large number of offline guesses

Right, the entire search space of random passwords.

The matching hash characters are tongue-in-cheek. They don't help you. They could've just given you the entire hash up front and you would still have to search the entire random password space. Sure, you could do it "offline", but it would still take forever to compute

This is the best description of why it's completely infeasible to make a system to guess it.

It would be only be possible if the password length was below a certain threshold (maybe 30 characters) beyond that limit, there wouldn't be enough atoms in the known universe in order to store every hash/password combination.... making it physically impossible....

Sure it is, dictionary attacks are extremely efficient, if the password is made out of dictionary words.