Hacker News
by DeathArrow 1700 days ago
Useless analysis, since the author says it's not a cryptographic hash but is useful as a fast hash for change detection.

"we wanted a fast, non-cryptographic hash for use in change detection and deduplication"

>A great alternative is XXH3

Meow Hash is twice as fast.

6 comments

If you had made it one section into the analysis, you would have seen that at the time MeowHash made certain cryptographic claims that the author set out to disprove.

The readme has since been updated. I didn't check whether any algorithmic changes were made on top of that, but the discussion of the analysis on GitHub didn't point to a lot of low-hanging fruit.

It's not useless analysis, because even for non-cryptographic hashes you want the likelihood of any arbitrary hash to be roughly equal. A hash function which "prefers" certain outputs has a far higher probability of collision.
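
A rough sketch of that point, under stated assumptions: `uniform16` and `biased16` are hypothetical toy hashes built on SHA-256 (neither is Meow Hash or XXH3), and the "preference" is modeled by forcing the low 4 bits to zero so that only 4096 of the 65536 possible outputs ever occur.

```python
import hashlib

def uniform16(data: bytes) -> int:
    """Toy 16-bit hash: the first two bytes of SHA-256, treated as roughly uniform."""
    return int.from_bytes(hashlib.sha256(data).digest()[:2], "big")

def biased16(data: bytes) -> int:
    """Same toy hash with the low 4 bits forced to zero, so it 'prefers' 1/16 of the outputs."""
    return uniform16(data) & 0xFFF0

def count_collisions(hash_fn, n: int) -> int:
    """Count inputs (out of n) whose hash value was already produced by an earlier input."""
    seen = set()
    collisions = 0
    for i in range(n):
        h = hash_fn(str(i).encode())
        if h in seen:
            collisions += 1
        seen.add(h)
    return collisions

if __name__ == "__main__":
    n = 2000
    print("uniform:", count_collisions(uniform16, n))
    print("biased: ", count_collisions(biased16, n))
```

Because every collision of the uniform version is also a collision of the biased one, the biased count can never be lower, and with 2000 inputs it should come out much higher.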
Don't you think asset planting is an attack against a game's pipeline?

The author of the article's page claims the hash is not cryptographic but actually goes on to make security claims about the hash. People who do not understand cryptography should be careful about making such claims. The author appears to understand this more than your comment demonstrates.

For example, a claim about change detection is a cryptographic claim of detecting preimage attacks. In a threat model, a security professional would determine whether a first preimage or a second preimage attack is what should be guarded in attack scenarios. Then, the professional would help with analysis, determining mitigations, defense in depth, and prioritization of fixing the vulnerabilities exposed by how the hash is used.
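
To make the two terms concrete, here is a minimal sketch that assumes a deliberately weak, hypothetical 16-bit hash (`tiny_hash`, SHA-256 truncated to two bytes) so that brute force finishes quickly. In a first preimage attack only the digest is known; in a second preimage attack the original message is known and a different message with the same digest is wanted.

```python
import hashlib
from itertools import count
from typing import Optional

def tiny_hash(data: bytes) -> bytes:
    """Hypothetical weak hash for illustration: the first 16 bits of SHA-256."""
    return hashlib.sha256(data).digest()[:2]

def _search(target: bytes, exclude: Optional[bytes]) -> bytes:
    """Brute-force candidates until one hashes to `target` and is not `exclude`."""
    for i in count():
        candidate = b"forged-%d" % i
        if candidate != exclude and tiny_hash(candidate) == target:
            return candidate

def find_preimage(digest: bytes) -> bytes:
    """First preimage: only the digest is known, any matching message will do."""
    return _search(digest, exclude=None)

def find_second_preimage(message: bytes) -> bytes:
    """Second preimage: the message is known, find a different one with the same digest."""
    return _search(tiny_hash(message), exclude=message)
```

With a 16-bit output the search needs about 65,000 attempts on average, which is why real integrity protection relies on much wider hashes designed to resist this kind of search.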

A hash cannot be considered standalone. It is the architecture and use-case where the hash's security properties are used to determine what security properties of the application are fulfilled.

So, if the author is correct, which seems to be the case, then meowhash should not be used in a production environment outside of the most simplistic checks. It seems faster for its intended use case to simply check for a single bit difference between two images - no hash required.

> Don't you think asset planting is an attack against a game's pipeline?

Realistically? No. It is one of those hypothetical scenarios where if an attacker can plant files into your file system, you have bigger problems.

https://devblogs.microsoft.com/oldnewthing/20060508-22/?p=31...

Realistically hashes are used during the development of a game to detect when a file or asset hash changed, and therefore it will trigger regeneration of assets that depend on it. For some long or slow build steps it is better to rely on a hash changing than the timestamp changing to trigger a build. It can also be used to fetch pre-built data from where it's cached on a server.

That's why it should be fast and doesn't have to be cryptographic.
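
A minimal sketch of that workflow, assuming a JSON manifest of content hashes kept between builds. The function names and the manifest layout are made up for illustration, and BLAKE2b from the standard library stands in for a faster non-cryptographic hash such as Meow Hash or XXH3.

```python
import hashlib
import json
from pathlib import Path

def file_digest(path: Path) -> str:
    """Hash a file in 1 MiB chunks; BLAKE2b is only a stand-in for a faster hash."""
    h = hashlib.blake2b(digest_size=16)
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def changed_assets(sources: list, manifest_path: Path) -> list:
    """Return the sources whose content hash differs from the manifest, then update the manifest."""
    old = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}
    new = {str(p): file_digest(p) for p in sources}
    manifest_path.write_text(json.dumps(new, indent=2))
    return [p for p in sources if old.get(str(p)) != new[str(p)]]
```

Touching a file without changing its bytes does not trigger a rebuild here, which is the advantage over timestamps. The same digest could also serve as the key for fetching pre-built data from a cache server.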

Regarding security, if you're at the point of someone malicious creating a file on your internal systems, you've already lost the battle.

> it seems faster for its intended use...

But then you have to store the entire before & after locally? That's the entire point of using a hash for change detection.
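
The storage trade-off can be sketched like this. The function names are hypothetical, and BLAKE2b with a 16-byte digest is an assumption standing in for whichever hash is actually used.

```python
import hashlib
from pathlib import Path

def digest(path: Path) -> bytes:
    """16-byte content digest; BLAKE2b is a stand-in for a fast non-cryptographic hash."""
    return hashlib.blake2b(path.read_bytes(), digest_size=16).digest()

def changed_by_compare(old_copy: Path, new_copy: Path) -> bool:
    """Direct comparison: needs both complete files to be available."""
    return old_copy.read_bytes() != new_copy.read_bytes()

def changed_by_digest(stored_digest: bytes, new_copy: Path) -> bool:
    """Hash comparison: needs only the 16 bytes remembered from the previous version."""
    return digest(new_copy) != stored_digest
```

The direct comparison can never give a false "unchanged", while the digest version trades a small collision probability for not having to keep the old copy around.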

> But then you have to store the entire before & after locally?

Yes, there is the difference between two copies (as you say) and there is integrity (modification detection). In the case of comparing new assets in a pipeline to those that were created earlier, it sounds plausible both copies would be present.

> That's the entire point of using a hash for change detection.

This is called integrity protection. Change detection is the incorrect term to use here. Please see what I referenced earlier for first and second preimage.

What determines whether a hash is "cryptographic"? What would make it suitable for change-detection but not be "cryptographic"? Is the claim here that it would not be suitable for detecting "malicious" changes, but is still suitable for detecting "natural" changes?
A couple of features are that it is hard for an attacker to find a collision, and that it is hard to recover the original data from the hash. Failing either can cause serious security problems, but neither is necessary in something like this, where most image changes are "natural" and you gain speed by weakening those constraints.
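
As a toy illustration of the "natural" versus "malicious" distinction, consider an additive checksum. It is far weaker than any hash discussed in this thread and is assumed here only because its collisions are easy to see: an ordinary edit changes the sum, but anyone who rearranges bytes on purpose gets a collision for free.

```python
import hashlib

def additive_checksum(data: bytes) -> int:
    """Toy non-cryptographic checksum: the sum of all bytes modulo 2**32."""
    return sum(data) % 2**32

original = b"texture: brick, size: 512"
edited = b"texture: brick, size: 1024"  # an ordinary edit changes the byte sum
forged = b"texture: brick, size: 152"   # same bytes reordered, so the sum is identical

if __name__ == "__main__":
    print(additive_checksum(original) != additive_checksum(edited))  # edit is detected
    print(additive_checksum(original) == additive_checksum(forged))  # forgery goes unnoticed
    print(hashlib.sha256(original).digest() != hashlib.sha256(forged).digest())
```

A cryptographic hash such as SHA-256 is designed so that no comparably cheap trick for producing a colliding input is known, and that design goal is part of what costs speed.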
Saying it's twice as fast is rather misleading? They can both hash as fast as RAM speed allows anyway. And if it's something in cache I doubt one is significantly better than the other.
Meow hash is also written by people who initially thought SHA-1 was acceptable for large scale change hashing.