by kkielhofner 1741 days ago
The problem with using certificates is any media signed by a party (by nature) traces directly back to that source/certificate. With a certificate-based approach I can imagine something like Shodan meets Google Image Search being used to make it easier to source media for the purposes of enhancing training for an ML model. Needless to say I have serious concerns about this approach.

This is why our approach only embeds a random unique identifier in the asset and requires a client to extract the media identifier to verify integrity, provenance, etc.

There are also two problems at play here - are we trying to verify this media as being as close to the source photons as possible, or are we trying to verify this is what the creator intended to be attributable to them and released for consumption? The reality is everyone from Kim Kardashian to the Associated Press performs some kind of post-sensor processing (anything from cropping and white balance to HEAVY Facetuning, who knows what).


Ok - I like this for some use cases. To restate my understanding so you can tell me I'm wrong if I am:

I think that it's still the user's job to make sure that they are skeptical of the provenance of any photos that claim to be from, say, the NY Times, that are not viewed in the NYT's viewer (if they were using your system). And then, they should still trust the image only as far as they trust the NYT. But if they're viewing the image the "right" way they can generally believe it's what the NYT intended to put out.

And perhaps, over time, user behavior would adapt to fit that method of media usage, and it would be commonplace.

I am skeptical that that "over time" will come to pass. And I think that users will not apply appropriate skepticism or verification to images that fit their presuppositions. And I think malicious players (like some mentioned in the article) will attempt to build and propagate user behavior that goes around this system (sharing media on platforms that don't use the client, for instance).

And I guess making that broad set of problems harder or impossible is really what I'd like to solve. I can see how your startup makes good behavior possible, and I guess that's a good first step and good business case.

It's probably best for me to provide an example. We create three URLs for each media asset. For [1], [2], and [3] you can click our icon in the top right corner:

- A link to the asset itself (just the JPEG or whatever with embedded ID) [0]

- A link to a preview with Twitter Card, Facebook Open Graph, etc. support, suitable for sharing (and re-sharing) on social media [1]

- A link to an iframe embed for use wherever else [2]

For an asset where the user has configured the metadata to be hidden our verification API doesn't return anything other than the checksum to validate against [3].

Users can update the returned metadata, or hide the extended metadata, at any time, and the change takes effect dynamically and instantly - everywhere. So this way producers and consumers of content don't need to have a dedicated consumption implementation (but it could certainly be branded or white labeled to the content producer). Currently the client is just our JavaScript but we're working on mobile and browser extensions that can run anywhere against the raw asset link provided in [0].
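To make the flow described above concrete, here is a rough Python sketch of the general idea: the asset carries only a random ID, and a client looks that ID up to get a checksum (plus metadata, unless the owner has hidden it). This is not our actual API. The in-memory REGISTRY, the function names, the use of SHA-256, and hashing the raw bytes are all assumptions for illustration, and how the ID gets embedded in and extracted from the file is not modeled at all.

```python
import hashlib
import hmac
import uuid
from dataclasses import dataclass, field


@dataclass
class AssetRecord:
    """Hypothetical server-side record for one media asset."""
    checksum: str                      # SHA-256 hex digest (assumed algorithm)
    metadata: dict = field(default_factory=dict)
    hidden: bool = False               # owner chose to hide extended metadata


# Stand-in for the verification service's datastore (assumption).
REGISTRY = {}


def register_asset(data: bytes, metadata: dict, hidden: bool = False) -> str:
    """Store a record under a random ID that reveals nothing about the signer."""
    asset_id = uuid.uuid4().hex
    REGISTRY[asset_id] = AssetRecord(
        checksum=hashlib.sha256(data).hexdigest(),
        metadata=dict(metadata),
        hidden=hidden,
    )
    return asset_id


def lookup(asset_id: str) -> dict:
    """What a verification call might return; only the checksum when hidden."""
    record = REGISTRY[asset_id]
    response = {"checksum": record.checksum}
    if not record.hidden:
        response["metadata"] = dict(record.metadata)
    return response


def verify(data: bytes, asset_id: str) -> bool:
    """Client-side integrity check: recompute the digest and compare."""
    expected = lookup(asset_id)["checksum"]
    actual = hashlib.sha256(data).hexdigest()
    return hmac.compare_digest(expected, actual)
```

Because the metadata lives behind the ID rather than inside the file, flipping `hidden` or editing `metadata` on the record changes what every later `lookup` returns without touching copies of the asset already in circulation.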

[0] https://share.tovera.com/assets/c65b0658ab6e4d89963b1e0a319a...

[1] https://share.tovera.com/preview/c65b0658ab6e4d89963b1e0a319...

[2] https://share.tovera.com/embed/c65b0658ab6e4d89963b1e0a319a1...

[3] https://share.tovera.com/preview/e51de3d34bfe47d7bc25fb8f252...