by cm2187 4023 days ago
I don't get why it is a security problem that someone can manufacture false positives for an anti-virus. What is the benefit for a virus to have non-malicious code caught by the anti-virus?

False negatives would be more of an issue, for example if the anti-virus keeps whitelists and an attacker can craft malware whose MD5 matches that of Microsoft Excel. But that's not what the article refers to.

MD5 is only broken if you want to use it as a non-reversible hashing algorithm or as an unforgeable signature. But it's perfectly fine for many other uses.

3 comments

From the article:

  As you can see, binaries submitted for analysis are
  identified by their MD5 sums and no sandboxed execution is
  recorded if there is a duplicate (thus the shorter time
  delay). This means that if I can create two files with the
  same MD5 sum – one that behaves in a malicious way while the
  other doesn’t – I can “poison” the database of the product
  so that it won’t even try to analyze the malicious sample!
So it's a technique to get the scanner to ignore a malicious binary by constructing a non-malicious one with the same MD5 sum. This would be much harder if the scanner used a SHA-1 hash or similar.
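To make the poisoning concrete, here is a minimal sketch, not the product's actual design. `VerdictCache`, `fake_sandbox`, `toy_digest` and `find_collision` are all hypothetical names. Since producing a real MD5 collision needs dedicated tooling, `toy_digest` keeps only 8 bits of the MD5 output so that a colliding pair can be found by brute force; it stands in for a genuine full-width MD5 collision.

```python
import hashlib
from itertools import count

def toy_digest(data: bytes) -> str:
    # Toy stand-in for a collidable hash: only the first 8 bits of MD5.
    return hashlib.md5(data).hexdigest()[:2]

def fake_sandbox(data: bytes) -> str:
    # Hypothetical analysis step: flags anything containing the marker b"EVIL".
    return "malicious" if b"EVIL" in data else "clean"

class VerdictCache:
    """Hypothetical scanner that skips analysis when the digest is already known."""

    def __init__(self, digest_fn):
        self.digest_fn = digest_fn
        self.verdicts = {}
        self.sandbox_runs = 0

    def scan(self, data: bytes) -> str:
        key = self.digest_fn(data)
        if key in self.verdicts:
            # Duplicate digest: reuse the stored verdict, no sandbox run.
            return self.verdicts[key]
        self.sandbox_runs += 1
        verdict = fake_sandbox(data)
        self.verdicts[key] = verdict
        return verdict

def find_collision(benign_prefix: bytes, malicious_prefix: bytes, digest_fn):
    # The attacker controls BOTH files and varies a suffix on each until
    # a benign variant and a malicious variant share a digest.
    seen_benign, seen_malicious = {}, {}
    for i in count():
        suffix = b"#%d" % i
        benign = benign_prefix + suffix
        malicious = malicious_prefix + suffix
        seen_benign[digest_fn(benign)] = benign
        seen_malicious[digest_fn(malicious)] = malicious
        common = seen_benign.keys() & seen_malicious.keys()
        if common:
            key = next(iter(common))
            return seen_benign[key], seen_malicious[key]

benign, malicious = find_collision(b"print('hello')", b"EVIL payload", toy_digest)

weak = VerdictCache(toy_digest)
weak.scan(benign)                 # analyzed once, stored as clean
poisoned = weak.scan(malicious)   # served from the cache, never analyzed

strong = VerdictCache(lambda d: hashlib.sha256(d).hexdigest())
strong.scan(benign)
caught = strong.scan(malicious)   # different key, so it is analyzed
```

Submitting the benign file first is what poisons the cache: the malicious twin then inherits its verdict without a sandbox run. Keying the cache on a digest with no known collisions removes that shortcut in this sketch.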
But that's a whitelist. I thought anti-virus software works by blacklisting instead.
virustotal.com allows you to upload files to scan with a whole range of anti-virus programs. Before uploading, it will calculate the hash of your file client-side to see if the file should be uploaded or if a previously uploaded (by someone else) file with the same hash should be re-scanned with newer versions of the anti-virus.

I don't know which hashing algorithm they use, but it's an example of a situation where a whitelist is not used.

Yes, I think that's what the author was alluding to here, although I'm not sure:

  The approach may work with traditional AV software too as
  many of these also use fingerprinting (not necessarily MD5)
  to avoid wasting resources on scanning the same files over
  and over (although the RC4 encryption results in VT 0/57
  anyway…).
sha256sum or b2sum (BLAKE2b) would be far better than sha1 :)
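For anyone fingerprinting files from code rather than the command line, a small sketch using Python's standard `hashlib`. The helper name `digests` is made up for illustration; `hashlib.blake2b()` with its default 64-byte digest corresponds to what `b2sum` prints by default.

```python
import hashlib

def digests(data: bytes) -> dict:
    # Hypothetical helper: compute several fingerprints of the same bytes.
    return {
        "md5": hashlib.md5(data).hexdigest(),          # 128-bit, collisions are practical
        "sha256": hashlib.sha256(data).hexdigest(),    # 256-bit
        "blake2b": hashlib.blake2b(data).hexdigest(),  # 512-bit by default
    }

result = digests(b"")
```

The longer digests cost a little more storage per cache entry, which is usually negligible next to the cost of analyzing a file.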
You misunderstand. The researchers are presenting a way to manufacture false negatives for an anti-virus. It works by confusing antivirus vendors' infrastructure into thinking it's already analyzed an executable and found it to be innocent when it's really analyzed something else.
The attack vector would be a malware binary crafted to have the same MD5 sig as a popular, already trusted app. But of course once the badware is caught, virus scanners could check other properties aside from the MD5 sig to flag a bad binary. I assume virus scanners use MD5 just as a fast prescreen, then do a few deeper checks on potentially bad binaries to make sure.
What you describe there would be a preimage attack[0], not a collision attack. There is no publicly known practical[1] preimage attack on MD5 at this time.

0. http://en.wikipedia.org/wiki/Preimage_attack

1. 2^123.4 complexity is not practical

If I understand correctly, what's described in the article indeed uses a collision attack. The "trusted app" is one the hacker manufactures himself.
Yes, that is what the article says, but the person I was responding to misunderstood.