by hannob 898 days ago
> The article mentions the key detail: MD5 is broken for cryptography (collisions) but not for second preimage attacks.

The problem with this argument is that people often don't properly understand the security requirements of systems. I can't count the number of times I've seen people say "md5 is fine for use case xyz" where in some counterintuitive way it wasn't fine.

And tbh, I don't understand the urge of people to defend broken hash functions. Just use a safe one, even if you think you "don't need it". There's no downside to choosing a secure hash function, and it's far easier to do that than to actually show that you "don't need it" (instead of just having a feeling you don't need it).

In the unlikely event that you think performance matters (which is unlikely, as cryptographic hash functions are so fast that it's really hard to build anything where the difference between md5 and sha256 matters), even that's covered: blake3 is faster than md5.
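
A rough way to check the performance claim on your own machine is sketched below. It is a minimal timing loop, not a rigorous benchmark. The helper name `throughput_mb_s` and the payload size are assumptions made for illustration. BLAKE3 is not in Python's standard library, so `blake2b` is used here only as a stand-in for a modern hash; it says nothing about BLAKE3's actual speed.

```python
import hashlib
import time


def throughput_mb_s(name: str, payload: bytes, rounds: int = 10) -> float:
    """Return a rough hashing throughput in MB/s for a hashlib algorithm name."""
    start = time.perf_counter()
    for _ in range(rounds):
        hashlib.new(name, payload).digest()
    # Guard against a zero reading on very coarse timers.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return (len(payload) * rounds) / (elapsed * 1_000_000)


if __name__ == "__main__":
    payload = b"\x00" * (1024 * 1024)  # 1 MiB of zeros, an arbitrary test input
    # md5 may be unavailable on FIPS-restricted builds; drop it there.
    for name in ("md5", "sha256", "blake2b"):
        print(f"{name:8s} {throughput_mb_s(name, payload):10.1f} MB/s")
```

The numbers depend heavily on CPU, OpenSSL build, and input size, so treat them as an order-of-magnitude check only.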

> I can't count the number of times I've seen people say "md5 is fine for use case xyz" where in some counterintuitive way it wasn't fine.

I can count many more times that people told me that md5 was "broken" for file verification when, in fact, it never has been.

My main gripe with the article is that it portrays the entire legal profession as "backwards" and "deeply negligent" when they're not actually doing anything unsafe -- or even likely to be unsafe. And "tech" apparently knows better. Much of tech, it would seem, has no idea about the use cases and why one might be safe or not. They just know something's "broken" -- so, clearly, we should update immediately or risk... something.

> Just use a safe one, even if you think you "don't need it".

Here's me switching 5,700 or so hashes from md5 to sha256 in 2019: https://github.com/spack/spack/pull/13185

Did I need it? No. Am I "compliant"? Yes.

Really, though, the main tangible benefit was that it saved me having to respond to questions and uninformed criticism from people unnecessarily worried about md5 checksums.
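
For readers unfamiliar with the use case, file verification against a published checksum looks roughly like the sketch below. This is not Spack's implementation; `file_digest` and `verify_file` are hypothetical helper names, and the algorithm is a parameter so the same code covers both md5 and sha256.

```python
import hashlib
import hmac


def file_digest(path: str, algorithm: str = "sha256", chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large downloads need not fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_file(path: str, expected_hex: str, algorithm: str = "sha256") -> bool:
    """Compare the file's digest with the published one."""
    return hmac.compare_digest(file_digest(path, algorithm), expected_hex.lower())
```

In this setting the expected digest is fixed by a trusted publisher, so an attacker who wants to substitute a file has to match that one given digest, which is a second preimage problem rather than a collision problem.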

>And "tech" apparently knows better.

The tech community has a massive problem with Dunning-Kruger, and has for basically ever. Hell, two decades ago when I was a young guy working in the field, so did I.

I'm not sure if it's because the field is basically a young man's game and that's inherent with relative youth, or if there's something deeper going on, but it's hard to ignore once you notice it.

That said, the idea that you have a better handle on what's going on in the legal system and the needs/uses legal professionals have than actual people in the legal profession and academics in the legal field is a pretty big leap even with those priors.

> I can't count the number of times I've seen people say "md5 is fine for use case xyz" where in some counterintuitive way it wasn't fine.

Help us out by describing a time when this happened. MD5's weaknesses are easily described, and importantly, it is still (second) preimage resistant.
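
The gap between the two properties can be illustrated on a deliberately tiny hash. The sketch below truncates SHA-256 to `BITS` bits (an assumption chosen so both searches finish quickly) and is not an attack on MD5 itself. Finding any collision takes on the order of 2^(BITS/2) tries, while matching one fixed message's digest takes on the order of 2^BITS tries.

```python
import hashlib
from itertools import count

BITS = 16  # toy digest size, chosen only so the searches run in well under a second


def toy_hash(data: bytes) -> int:
    """First BITS bits of SHA-256 as an integer. Deliberately weak."""
    full = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return full >> (256 - BITS)


def find_collision():
    """Attacker chooses BOTH messages. Returns (m1, m2, tries)."""
    seen = {}
    for i in count():
        msg = b"free-%d" % i
        digest = toy_hash(msg)
        if digest in seen:
            return seen[digest], msg, i + 1
        seen[digest] = msg


def find_second_preimage(target: bytes):
    """Attacker must match the digest of a FIXED message. Returns (m2, tries)."""
    want = toy_hash(target)
    for i in count():
        msg = b"forced-%d" % i
        if msg != target and toy_hash(msg) == want:
            return msg, i + 1


if __name__ == "__main__":
    m1, m2, tries = find_collision()
    print("collision after", tries, "tries:", m1, m2)
    m3, tries = find_second_preimage(b"trusted input")
    print("second preimage after", tries, "tries:", m3)
```

MD5's known breaks are on the collision side, where the attacker picks both inputs, which is why the distinction keeps coming up in this thread.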

I agree that upgrading is likely your best bet. But I've found the other direction of bad reasoning is a more pernicious trap to fall into. "My system uses bcrypt somewhere so therefore it is secure" and the like is often used as a full substitute for thinking about the entirety of the system.

> MD5's weaknesses are easily described, and importantly, it is still (second) preimage resistant

Most devs have no idea what that means, but most devs still need to use hash functions. They need to use primitives that match their mental model of a hash function. Said model is https://en.m.wikipedia.org/wiki/Random_oracle
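
For anyone who hasn't met the term, a random oracle can be sketched as a lazily filled lookup table: a fresh input gets a uniformly random output, and a repeated input gets the same output as before. The class below is an idealized model for illustration only (the name `RandomOracle` and its interface are assumptions); no real hash function is implemented this way.

```python
import os


class RandomOracle:
    """Idealized hash: independent random output per input, consistent on repeats."""

    def __init__(self, out_len: int = 32):
        self._out_len = out_len
        self._table = {}  # maps each queried message to its sampled output

    def query(self, message: bytes) -> bytes:
        # Sample a fresh random answer the first time a message is seen.
        if message not in self._table:
            self._table[message] = os.urandom(self._out_len)
        return self._table[message]
```

Under this model, the only way to get a collision or a preimage is to keep querying until one turns up by chance, which is the behavior most developers implicitly expect from a hash function.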

The usual answer here is "don't roll your own crypto", but in practice abstinence-only cryptography education doesn't work.

> Help us out by describing a time when this happened.

Linus Torvalds saying that SHA-1 is okay for git, while it is used for Git signatures as well. Signatures are a classic "you need collision resistance to have safe signatures, but people are often confused about it" case.

I might be mistaken, but wouldn't a git signature already be signing trusted things (i.e. the person making the original signature is trusted), making any attack enabled by the input hash function a second preimage attack (i.e. an attacker only knows the trusted input, not anything private like the signing key)?

Hash collisions mean you can't trust signatures from _untrusted_ sources, but git signatures don't seem to fit that situation.

As you pointed out, signatures make content trusted, but only to the degree of the algorithm's attack resistance. I think it's also important to define trust; for our purposes this means: authenticity (the signer deliberately signed the input) and integrity (the input wasn't tampered with).

If an algorithm is collision resistant, a signature guarantees both authenticity and integrity. If it's just second preimage resistant, signing may only guarantee authenticity.

Now, the issue with Git using SHA-1 is that an attacker may submit a new patch to a project, rather than attack an existing commit. In that case they are in control of both halves of the collision, and they just need for the benign half to be useful enough to get merged.

Any future commits with the file untouched would allow the attacker to swap it for their malicious half, while claiming integrity thanks to the maintainers' signatures. They could do this by either breaching the official server or setting up a mirror.
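
The swap described above can be sketched with a toy hash-then-sign scheme. Everything here is an assumption made for illustration: the hash is SHA-256 truncated to 16 bits so a collision is cheap to find, HMAC stands in for a real public-key signature, and the file contents are made up. It is not how Git or GPG work internally; it only shows that a signature over a digest covers every input with that digest.

```python
import hashlib
import hmac
from itertools import count

BITS = 16  # toy digest size so the collision search finishes quickly


def toy_hash(data: bytes) -> bytes:
    """First BITS/8 bytes of SHA-256. Deliberately collision-prone."""
    return hashlib.sha256(data).digest()[: BITS // 8]


def sign(key: bytes, message: bytes) -> bytes:
    """Hash-then-sign, with HMAC standing in for a real signature scheme."""
    return hmac.new(key, toy_hash(message), hashlib.sha256).digest()


def verify(key: bytes, message: bytes, signature: bytes) -> bool:
    return hmac.compare_digest(sign(key, message), signature)


def craft_colliding_pair():
    """Attacker varies a trailing comment in both files until the digests match."""
    benign_seen, evil_seen = {}, {}
    for i in count():
        benign = b"print('hello')  # v%d" % i
        evil = b"steal_secrets()  # v%d" % i
        d_benign, d_evil = toy_hash(benign), toy_hash(evil)
        benign_seen[d_benign] = benign
        evil_seen[d_evil] = evil
        if d_benign in evil_seen:
            return benign, evil_seen[d_benign]
        if d_evil in benign_seen:
            return benign_seen[d_evil], evil


if __name__ == "__main__":
    maintainer_key = b"maintainer-secret"  # hypothetical signing key
    benign, evil = craft_colliding_pair()
    signature = sign(maintainer_key, benign)  # maintainer reviews and signs the benign file
    print("signature also verifies the malicious file:",
          verify(maintainer_key, evil, signature))
```

The attacker never touches the key. They only need the maintainer to sign the benign half of a pair they prepared in advance.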

One interesting thing to note though: in the case of human readable input such as source code, this attack breaks down as soon as you verify the repo contents. Therefore it's only feasible longer term when using binary or obfuscated formats.

> And tbh, I don't understand the urge of people to defend broken hash functions. Just use a safe one, even if you think you "don't need it".

The ideal discourse would not imply a binary sense of "safety" at all, much less for a function evaluated outside the context and needs of its usage....

The thing is: We have a binary definition of safety for cryptographic hash functions, and it works well.

You can add a non-binary sense of safety to cryptographic hash functions, but it makes stuff a lot more complicated for no good reason. If you use the "preimage-safe-but-not-collision-safe" ones, you need to do a lot more analysis to show safety of your whole construction. You could do that, but it gives you no advantage.