by flumpcakes 839 days ago
I worked on research that built machine learning models to take that 'entropy', or sensor pattern noise, and match it against photographs, to trace image lineage when EXIF and similar metadata are stripped out.

For a practical application: as you can imagine, there are certain crimes where it really makes a difference if an image is just on a phone, or if it was verifiably taken by that phone. Possession-of vs. Production-of...

That's interesting. But, assuming the research found it possible to verify which device a photo came from based on the sensor noise, doesn't that kind of go against the idea that there is a lot of entropy in sensor pattern noise?
No, because a device can be always identifiable and nevertheless produce a randomly varying sequence of images.

Take printers, for example: some are known to print an identifying signature. Clearly they can still print a sheet in a solid colour (or a number from 0 to 100, or some character, or whatever) chosen at random (given some random source and a way to control the printer with it, I mean), despite the device being identifiable.
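
To make that argument concrete, here is a toy sketch in Python. It assumes a deliberately simplified model that is not from the thread: each pixel has a fixed per-device offset (the "fingerprint") plus fresh Gaussian noise on every shot. The names `make_fingerprint`, `capture_frame`, and the noise levels are made up for illustration, and real sensor-pattern-noise matching is more involved than plain averaging.

```python
import random
import statistics

N_PIXELS = 2000  # toy sensor size (assumption)

def make_fingerprint(seed):
    """Fixed per-pixel offsets standing in for a device's pattern noise."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(N_PIXELS)]

def capture_frame(fingerprint, rng, noise_sigma=3.0):
    """One shot: the fixed pattern plus fresh random noise per pixel."""
    return [f + rng.gauss(0.0, noise_sigma) for f in fingerprint]

def average_frames(frames):
    """Per-pixel mean; the random part shrinks, the fixed part stays."""
    return [statistics.fmean(column) for column in zip(*frames)]

def correlation(a, b):
    """Pearson correlation of two equal-length sequences."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = (sum((x - mean_a) ** 2 for x in a)
           * sum((y - mean_b) ** 2 for y in b)) ** 0.5
    return num / den

device_a = make_fingerprint(seed=1)
device_b = make_fingerprint(seed=2)

shot_rng = random.Random(12345)
frames = [capture_frame(device_a, shot_rng) for _ in range(50)]
estimate = average_frames(frames)

# The averaged frames match device A's pattern and not device B's.
corr_same = correlation(estimate, device_a)
corr_other = correlation(estimate, device_b)

# The per-shot random parts of two frames are unrelated to each other.
residual_0 = [x - f for x, f in zip(frames[0], device_a)]
residual_1 = [x - f for x, f in zip(frames[1], device_a)]
corr_residuals = correlation(residual_0, residual_1)

print(f"estimate vs device A: {corr_same:.3f}")
print(f"estimate vs device B: {corr_other:.3f}")
print(f"residual 0 vs residual 1: {corr_residuals:.3f}")
```

Under this model the fixed component identifies the device while the per-shot component stays unpredictable, so the two claims do not conflict. It also suggests that only the per-shot component should count toward any entropy estimate.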

I spend a lot of time thinking about randomness, and after running some tests on the entropy of dark images, I have started to believe that there is a lot less entropy in dark CCD images than people think, but there is still enough to get a useful entropy stream.

A substantial portion of the "noise" from a CCD is definitely not random.
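
As a rough illustration of what "less entropy than people think, but still enough" can look like, the sketch below estimates entropy per pixel from a histogram and then hashes the samples into a fixed-size output. Everything here is an assumption for illustration: the dark frame is simulated with Python's `random` module rather than read from a sensor, the function names are hypothetical, and the most-common-value estimate treats pixels as independent, which real sensor data (with its non-random component) would violate.

```python
import hashlib
import math
import random
from collections import Counter

def shannon_entropy_per_sample(samples):
    """Average bits per sample, from the observed value histogram."""
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in Counter(samples).values())

def min_entropy_per_sample(samples):
    """Conservative bits per sample, based on the most common value."""
    p_max = max(Counter(samples).values()) / len(samples)
    return -math.log2(p_max)

def simulated_dark_frame(n_pixels, rng):
    """Stand-in for a dark frame: small values clipped to 0..255."""
    return [max(0, min(255, round(rng.gauss(1.0, 1.0)))) for _ in range(n_pixels)]

def condition(samples):
    """Compress many weak samples into 32 bytes with SHA-256."""
    return hashlib.sha256(bytes(samples)).digest()

rng = random.Random(2024)  # simulation only; not an entropy source
frame = simulated_dark_frame(10_000, rng)

h_shannon = shannon_entropy_per_sample(frame)
h_min = min_entropy_per_sample(frame)
digest = condition(frame)

print(f"Shannon entropy: {h_shannon:.2f} bits per 8-bit pixel")
print(f"Min-entropy:     {h_min:.2f} bits per 8-bit pixel")
print(f"Conditioned output: {digest.hex()}")
```

The point of the sketch is the gap between 8 bits of storage per pixel and the much smaller estimated entropy: a stream like this is only useful after conditioning, and only if far more raw bits go in than come out.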

I'm curious, how close to raw CCD data did you get from consumer cameras? It wouldn't surprise me if hard-wired camera internal postprocessing often almost immediately regularizes random noise, even with raw images and software postprocessing turned off. Just a wild-ass guess though.
I didn't use consumer cameras to test this, and I assume Cloudflare doesn't either.
I was annoyed at the caginess of your answer until I saw that you run a randomness-as-a-service company and these are low-grade trade secrets.
If I were going to take a stab at this, I would guess that most of the "camera" is really unnecessary and that you could do this using just the image sensor.

A lot of the camera is just functionality for making actual pictures better, which doesn't apply here. E.g., you don't need to control exposure with shutter speed if the sensor is in a black box.

Having a whole camera might even be counterproductive. E.g., actuating the shutter is predictable, so if it creates a signal that shows up in the output, it might reduce the entropy.

Or maybe they just mean pro quality cameras, but I'm not sure why you'd want a whole camera instead of just the sensor. Reasons are not readily apparent, and I don't expect anyone to be immediately ready to correct me on trade secrets.