Hacker News new | ask | show | jobs
by edflsafoiewq 1519 days ago
Padding without the length isn't suffix-free, ie. there's two different messages x and y with pad(x) a suffix of pad(y). You want that basically because if there's ever a collision part-way through the loop on two inputs, if there's a common suffix, the rest of the loop will be the same so there's no chance to "escape" the collision.

Being some kind of artifact seems plausible.

1 comments

> Padding without the length isn't suffix-free

Can you give a more concrete example? Specifically with padding with 1 then all 0s without length. Appending the 1 then all 0s is supposed to prevent collisions.

The length suffix is used as an early attempt to avoid length extension attacks on MACs of the form H(secret|M). However, we've later seen that it's not sufficient as it's easy to determine length of the secret by trial and error. This eventually led to the creation of HMAC H(H(secret^opad)|H(secret^ipad|M)).

In theory, the length suffix is no longer needed (or the "1" suffix but we save more space by removing the length). Maybe a cryptographer with more history knowledge can explain this but personally I think it's now one of those "don't fix what's not broken" things. It doesn't hurt security and it's already been thoroughly analyzed (and hardware optimized) so we just leave it.

Of it not being suffix-free? Just stick a full block on the front of the message.

Length padding isn't just for MACs, its used in the first place to prove Merkle-Damgard works at all, ie. is collision-preserving. If you have a hash collision with two messages with the same length, you can run through the Merkle-Damgard loop in parallel and find a collision in the compression function. For different lengths that doesn't work but if you just insert the length into the last block, you know you'll always have a collision in the final block in that case.

There's a proof here that being suffix-free is necessary and sufficient for a padding rule to make Merkle-Damgard collision-preserving: https://eprint.iacr.org/2009/325.pdf

> stick a full block on the front of the message

Yes, this will give you the same intermediate state in the hash function but if m1!=m2, then the final output H(pad'(x|m1)) != H(pad'(x|m2)) where pad' is just appending 1 then all 0s.

Maybe it's due to the years since my crypto class or maybe I'm just dense but I'm not following your logic at all. I'm asking for an actual example of m1 and m2 that will result in the same hash if they are not padded with their lengths.

I made the assertion that for all m1,m2 if m1!=m2 then H(pad'(m1))!=H(pad'(m2)). Disproving it is simply giving a counter example m1,m2.

EDIT: I took some time to skim through the paper you've linked and it supports my assertion:

> We also have shown that the simplest padding such as padding 10^d only can be sufficient for collision preserving property if we restrict collision resistant assumption of the underlying compression function for the first (t − 1) bits.

Since we know (or at least believe) the SHA2 compression function is collision resistant, then length padding is redundant.

Oh, we're talking about different things. I'm only talking about proving collision-preservation, not something as spectacular as generating collisions.