While the likelihood of a collision (whether "accidental" or malicious) is probably still relatively low even with MD5, what reasonable, compelling reason is there to still be using MD5 -- especially when much stronger hash algorithms such as SHA256 and SHA512 are available?
Generate both SHA256 and SHA512 hashes or maybe SHA512 and some other "unrelated" algorithm (if you want to really play it safe), dump them into a checksums.txt (or whatever) file, then PGP/GPG sign that file (with a widely distributed and certified/signed public key, of course) and you can effectively eliminate any chances of a collision whatsoever.
It seems that the benefits of switching would greatly outweigh the costs associated with doing so (unless this would require some major code changes to your processes/pipelines/etc.).
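For what it's worth, the checksum step described above is only a few lines. Here is a minimal sketch in Python using the standard hashlib module; the function names, the checksums.txt default, and the "ALGO (name) = digest" line format are my own assumptions for illustration, not anything the DragonFly project actually uses.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm, chunk_size=1 << 20):
    """Hash a file in chunks so a large image need not fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def write_checksums(paths, out_path="checksums.txt",
                    algorithms=("sha256", "sha512")):
    """Write one 'ALGO (name) = hexdigest' line per file and algorithm.

    The line format is a hypothetical choice made for this sketch.
    """
    lines = []
    for p in paths:
        for algo in algorithms:
            digest = file_digest(p, algo)
            lines.append(f"{algo.upper()} ({Path(p).name}) = {digest}")
    Path(out_path).write_text("\n".join(lines) + "\n")
    return lines
```

The resulting file can then be signed separately, for example with `gpg --armor --detach-sign checksums.txt`, so that users verify the signature first and the hashes second.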
Rather than being "vaguely possible", MD5 collisions are now so easy that you can make new ones at home on an ordinary PC in a few seconds, and you can do chosen-prefix collisions (which let you pick a "format" and collide within it) in a few hours. Cryptography never gets _less_ broken, only more broken with time.
Now, in this particular scenario (a publisher tells us the MD5 of an image, and we can check it to see we got the image) collision isn't so important. We have to trust you anyway, so we may as well trust you to not collide the hash too. But MD5 is no longer even proof against second pre-image attacks, albeit the best known are as yet impractical. This is really bad news.
MD5 has been known to be irrevocably broken since 2004 (and had been expected to fall since the mid-to-late 1990s). DragonFly BSD was only started in 2003. Why use MD5? Imagine if in 2003 you'd decided to build a 16-bit OS, or one that doesn't do TCP/IP, because after all, in the mid-to-late 1990s that might have seemed fine too...
The malicious case would be a bad actor distributing phony files that appear to check out, because flaws have been found that would allow someone to do so.
In practice today this requires that the bad actor collaborates with the real distributor.
MD5 fails against _collision_ attacks. A collision is when you can find two different things with the same hash. But being able to collide the hash is NOT the same as being able to find a second pre-image, which is what you'd need in order to get "phony files that appear to check out" if the person who originally issued the MD5 checksums didn't collaborate with you.
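To make the distinction concrete, here is a toy sketch. It does not attack MD5 at all: it truncates SHA-256 to 16 bits (the BITS value and all function names are assumptions made up for the example) so that both searches finish quickly by brute force. For a generic n-bit hash, finding any colliding pair takes on the order of 2^(n/2) hashes, while matching one fixed target takes on the order of 2^n. The real MD5 collision attacks exploit structural weaknesses and are far cheaper than the generic bound, but they still only produce a pair chosen by the attacker.

```python
import hashlib
from itertools import count

BITS = 16  # toy digest size for the demo; real MD5 digests are 128 bits


def toy_hash(data: bytes) -> int:
    """Truncate SHA-256 to BITS bits to get a deliberately weak hash."""
    full = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return full >> (256 - BITS)


def find_collision():
    """Find ANY two distinct inputs with the same toy hash.

    The attacker picks both messages, so remembering every hash seen
    so far (a birthday search) is enough.
    """
    seen = {}
    for i in count():
        msg = b"a-%d" % i
        h = toy_hash(msg)
        if h in seen:
            return seen[h], msg, i + 1
        seen[h] = msg


def find_second_preimage(target: bytes):
    """Find a different input whose toy hash matches one FIXED target.

    This is what replacing an already-published image would require.
    """
    want = toy_hash(target)
    for i in count():
        msg = b"b-%d" % i
        if msg != target and toy_hash(msg) == want:
            return msg, i + 1


if __name__ == "__main__":
    m1, m2, collision_tries = find_collision()
    forged, preimage_tries = find_second_preimage(b"official-image")
    print("collision found after", collision_tries, "hashes")
    print("second pre-image found after", preimage_tries, "hashes")
```

Raising BITS widens the gap between the two searches quickly, which is why a broken collision resistance does not by itself let a stranger forge a match for a file someone else published.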
There have been practical demonstrations of using padding in several formats to generate valid files that collide with an original one, but with entirely different contents.
This means that it is possible that someone could download the real image, introduce some rootkit, and then tinker with it (for instance, by adding a hidden file with carefully crafted content) until the resulting MD5 is the same as that of the original image. Then hack the server and upload the modified image in place of the original one, and everyone who installs DragonFly is now their minion.
If you use a stronger hash (which is not harder for anyone than using MD5), then this attack vector becomes impossible. So... even if it is a remote possibility, just use the stronger hash because it is just a dominant strategy (it has upsides, yet 0 downsides).
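On the "not harder" point: in most tooling the algorithm is just a parameter. A rough sketch with Python's hashlib, where the verify function and its defaults are hypothetical names of mine:

```python
import hashlib
import hmac


def verify(data: bytes, expected_hex: str, algorithm: str = "sha256") -> bool:
    """Return True if data hashes to expected_hex under the given algorithm.

    The algorithm is only a name passed to hashlib.new(), so moving from
    a weaker hash to "sha256" or "sha512" changes one string, not the code.
    """
    actual = hashlib.new(algorithm, data).hexdigest()
    # Normalise the published value before comparing the two hex strings.
    return hmac.compare_digest(actual, expected_hex.strip().lower())


if __name__ == "__main__":
    image = b"pretend this is an ISO image"
    published = hashlib.sha512(image).hexdigest()
    print(verify(image, published, "sha512"))
    print(verify(image + b"!", published, "sha512"))
```

The caller's workflow is identical whichever hash is chosen; only the published digest gets longer.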
There have been no practical demonstrations of matching the MD5 of an arbitrary, pre-existing file (i.e. a second pre-image attack), only situations where two files are created specifically with the intended purpose of creating a collision. This is precisely what the post you replied to said but you seem to have not understood that there's a distinction.
Yes, it is possible that the DragonFly developers could collude to create two ISOs with the same MD5, one good and one malicious. No, it is not possible that random, evil ne'er-do-wells could replace the ISO with one with the same MD5, unless the DragonFly developers have conspired with them to make that possible.
If you don't trust the DragonFly developers not to collide the MD5s, you probably shouldn't trust them with the code running in your kernel anyway.
The issue is that if these files are distributed elsewhere by third parties, it is trivial for those third parties to change and compromise the files, but still make the files produce the same MD5 sum.
If you think a preimage attack against MD5 is "trivial" you should demonstrate it. People would be very interested in this because no one has managed to do it yet.
Creating two files with the same MD5 is a very different beast from creating a file with the same MD5 as an arbitrary pre-existing file. These third parties would need to have colluded with the DragonFly developers to make what you're proposing possible.