by pmarreck 6 days ago
Related: I'm writing a file-format validator whose first focus is on image data.

https://validate.pics

Also related: I'm using LLM assistance to write it, but I also have a test suite that proves it's working. I call it the "shotgun" suite: given a good image file, it first flips a random bit ("sniper"), then a random byte ("boltgun"), and then a random 4096-byte segment, 4096 bytes being a typical sector size ("shotgun"). Each time, it tries to validate the file by decoding it fully and records whether the corruption was detected. Repeated over hundreds of runs, that gives the percentage of corruptions detected at each scope.
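To make the idea concrete, here is a minimal Python sketch of a suite along these lines. It is not the actual validate.pics code. Everything in it is an assumption made for illustration: the function names, the choice to apply each mutation to a fresh copy of the good file, the reading of "flips a byte" as inverting all eight bits, and the use of a zlib stream as a stand-in for an image file and decoder.

```python
import random
import zlib


def sniper(data: bytes, rng: random.Random) -> bytes:
    """Flip one randomly chosen bit."""
    buf = bytearray(data)
    pos = rng.randrange(len(buf))
    buf[pos] ^= 1 << rng.randrange(8)
    return bytes(buf)


def boltgun(data: bytes, rng: random.Random) -> bytes:
    """Invert all bits of one randomly chosen byte (assumed meaning of 'flip a byte')."""
    buf = bytearray(data)
    pos = rng.randrange(len(buf))
    buf[pos] ^= 0xFF
    return bytes(buf)


def shotgun(data: bytes, rng: random.Random, size: int = 4096) -> bytes:
    """Overwrite one randomly placed segment of up to `size` bytes with random bytes."""
    buf = bytearray(data)
    size = min(size, len(buf))
    start = rng.randrange(len(buf) - size + 1)
    buf[start:start + size] = bytes(rng.randrange(256) for _ in range(size))
    return bytes(buf)


MUTATIONS = {"sniper": sniper, "boltgun": boltgun, "shotgun": shotgun}


def zlib_validate(data: bytes) -> bool:
    """Stand-in validator: a full decode succeeds only if the stream is
    well-formed and its Adler-32 checksum matches."""
    try:
        zlib.decompress(data)
        return True
    except zlib.error:
        return False


def detection_rates(good: bytes, validate, trials: int = 200, seed: int = 0) -> dict:
    """Corrupt a fresh copy of `good` on every trial and return, per mutation
    scope, the fraction of trials in which `validate` rejected the result."""
    rng = random.Random(seed)
    rates = {}
    for name, mutate in MUTATIONS.items():
        detected = sum(1 for _ in range(trials) if not validate(mutate(good, rng)))
        rates[name] = detected / trials
    return rates


def make_sample(seed: int = 1, size: int = 16384) -> bytes:
    """Build a deterministic 'good file': a zlib stream of pseudo-random bytes."""
    rng = random.Random(seed)
    return zlib.compress(bytes(rng.randrange(256) for _ in range(size)))


if __name__ == "__main__":
    for scope, rate in detection_rates(make_sample(), zlib_validate).items():
        print(f"{scope}: {rate:.1%} detected")
```

A checksummed container like zlib should score high at every scope here. The interesting numbers would come from swapping `zlib_validate` for a real image decoder, where formats with little or no integrity checking would presumably score lower.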

The point is to detect things like corrupt data and bitrot, across 240+ different filetypes so far, since no other tool really exists yet in this space to do that.

Note that some formats, notably Apple's HEIC, are so data-dense that corruption only results in image corruption that is undetectable (well, a human would notice it, but an algorithm cannot!). So I have ANOTHER app coming to help with that, which does detection AND repair (to a point). ;)

The CLI will be free and open-source, but I'm also writing a closed-source GUI for it that I plan to sell in the future.