Anyone have a good explanation on the intuition of non-interactive zero-knowledge proofs? For example, I thought the "paint-mixing" analogy for Diffie-Hellman key exchange (https://en.wikipedia.org/wiki/Diffie–Hellman_key_exchange#Ge...) really helped me handwave the math into "mixing easy, unmixing hard".
An intuitive explanation is that of proving you can find Waldo in a picture without revealing his exact location. Digital wallets can be interpreted as fancy signature schemes that operate on third-party issued commitments C instead of public keys that directly link users to their identities.
A simple signature scheme is based on proof of knowledge PoK{x : pk = g^x}, which is transformed into a noninteractive variant via the Fiat-Shamir transformation, where the message is appended to the hash. Range proofs work similarly, with the simplest form being for a single bit: PoK{(b,r) : C = g^b * h^r & b(b−1)=0}. This proves that commitment C contains a bit b in {0,1} without revealing which value it is.
Arbitrary ranges can then be constructed using the homomorphic properties of commitments. For an n-bit range, this requires n individual bit proofs. Bulletproofs optimize this to O(log n) proof size, enabling practical applications.
The commitment C can be issued by a trusted third party that signs it, and the user can then prove certain properties to a service provider, such as age ranges or location zones (constructed from latitude and longitude bounds).
A key challenge is that reusing the same commitment C creates a tracking identifier, potentially compromising user privacy.
for explanation i've seen for the where's waldo analogy: imagine the single page of the where's waldo puzzle, and another giant piece of paper with the shape of waldo cut out of it.
by providing a picture of waldo in the cut-out, you can prove you know where he is without providing the location. a zero knowledge proof.
I think the Where's Waldo example, while not technically zero knowledge, gives a pretty good intuition of the idea behind it.
It certainly gives a "layperson" example of being able to prove you know something without revealing it, which isn't the whole definition of ZK but is the idea driving it.
Imagine it isn't Waldo, but an unknown figure and you are only given the silhouette to find. If you can draw what's within the silhouette or something, you've proven you've located it to high certainty without saying where.
Say the whole image looked like noise and was generated from quantum measurements, and the coordinates to hash for the problem were generated with quantum measurements, and you were given the silhouette and the hash of the noise within to look for. I could see it for proof of work: you could slide along a hashing window and prove you actually did work examining half the image on average or whatever.
I think my example isn't great and would need to be modified like maybe give the hash of a neighboring area to prove you found it, so your answer couldn't be used by others to find the location much more cheaply.
This is an interactive example, isn't it? It doesn't help me understand non-interactive proofs like SNARKs/STARKs, where the verifier isn't communicating live with the prover.
* The prover commits to a starting value (public input)
* Instead of waiting for an interactive challenge, they hash it and use the resulting hash output as if it were a challenge
If we believe the hash is a random oracle (as we do for cryptographic hash functions), then it is hard for the prover to manipulate the challenges. Is that it?
You got it. There are a few nuisances, e.g. the "theorem statement" must be hashed as well so that proving that name=Mickey has a different oracle than proving that name=Goofy, but your basic understanding is correct.
This is what I was going to post. It helped me a lot by first giving a very intuitive understanding of the concept of ZKPs using the Where's Waldo/puffin-among-the-penguins example, but then also going deeper with the graph-coloring example.
Was looking to see if someone posted this video. The first few interviews are excellent - the later ones, not so much (in terms of explaining ZK - they're good chats, of course).
Usually in an IP, the prover (Bob) has to answer questions from the verifier (Alice), and Alice chooses her questions by flipping a coin. If the Bob doesn’t really know the answer, he’ll get caught cheating with high probability.
So now the trick: Bob starts generates his initial answer. Then he hashes it (“commits” in the jargon), and uses the hash as “Alice’s first coin flip”. Then he answers the question for that flip, hashes the whole thing for “Alice’s second coin flip”… etc.
Bob does this say, 100 times, and then sends the whole simulated conversation to Alice. Alice can verify that he didn’t cheat by checking the intermediate hashes.
The whole thing depends on the ability to not control the result of the hash function, so it’s vital to use a cryptographically secure one.
It "feels" much easier to generate random non-solutions and check if the random questions happen to pass, though. Is it really all there is to it? You increase the number of questions to compensate and that's the whole scheme? Wouldn't the responses be a ludicrous amount of data?
Yes, basically. The hard work happens in constructing the interactive proof to ensure that you can’t reneg on the earlier steps.
All the proofs that I know of allow one to get lucky with probability about .5 in each round. When you do an interactive proof with 100 rounds, you have a 2^-100 chance of getting away with cheating.
When you go non-interactive with 100 rounds, an adversary could cheat by trying about 2^100 proofs. So you replace a stronger guarantee with a weaker one, but 2^100 is a pretty big barrier.
(I just looked and the Wikipedia page and it’s very confusing fwiw)
The trick is called the Fiat-Shamir transform, and you're right, it does require more questions to get an equivalent security level, precisely because you can try a large number of random non-solutions without anyone catching you doing it.
But the number of questions you need to compensate grows only a little.
For example, interactively if you ask for Merkle tree proof that selected leaf values have a particular property, you only have to ask for about k leaves to get probability 1-(2^-k) that you'd catch a dishonest prover who had committed a Merkle root with less than half the leaves having the property.
Non-interactively, a dishonest prover could secretly grind attempts, say 2^g times, and then you'd have a lower probability of catching them, approximately 1-(2^(g-k)). But g can't be all that large, so you can increase k to compensate without making the proof much larger.
You.can also require certain hashes to have a fixed prefix, like Bitcoin mining, forcing every prover to have to grind 2^p times. This reduces the effective g that a dishonest prover can achieve, allowing k to be smaller for the same security, so allowing the non-interactive proof to be smaller. At the cost of honest provers having to grind.
The surprising part of STARKS and SNARKS comes down to the nature of polynomials. It's surprisingly easy to tell two polynomials apart with a small number of random checks (Schwartz Zippel lemma). In light of this it's not surprising there is good reading comparing them to erasure codes which rely on exactly this property of polynomials.
The non-interactive piece is pretty straightforward you just simulate challenge response conversation with unbiasible public randomness and show the transcript (Fiat Shamir transform).
Another area worth exploring is how some of these proof systems can have such incredibly small proofs (192 bytes for any computation in groth16 zk snarks). That relies on the much more difficult to intuit theory of elliptic curve pairing functions.
Yeah I'm also interested in some of the details here, but the linked library repo is a bit too low-level for my current understanding.
For example, in the usecase of providing a proof-of-age to a website: who provides the verification data (the government?); what form does that take (a file in a standard format?); who holds/owns the verification data (the user?); who runs the verification software (the end-user's web browser?).
Can the user use any implementation to provide the proof, or must it be a "blessed" implementation such as Google Wallet?
The specifics depend on local regulations, but roughy speaking: the government gives you a document in a standard format (eg MDOC). Your phone stores the document, with cooperation from a secure element that binds the document to the phone. The website you visit verifies the proof. The government gives documents to whatever wallet they want, which may be a special government wallet. They may or may not give the document to Google Wallet.
> Your phone stores the document, with cooperation from a secure element that binds the document to the phone. The website you visit verifies the proof.
So it does require a "blessed" implementation, and I have to trust Google or Apple to handle my data? I cannot own the document myself and use an open-source client that I trust to provide the proof?
It depends on local regulations. As far as I can tell Europe will require some sort of blessing of the wallet. To be clear, governments will develop their own apps and it's not clear that Google will be blessed. We (Google) are giving them the code pro bono to improve privacy.
Hmm. This introduces a third party to the protocol, right? Specifically the developer of the wallet. So we now have three parties: the user, the wallet developer, and the relying party. Does this zk protocol protect the user's privacy from the wallet developer as well as the relying party?
In other words, does the protocol give the wallet access to information about the relying party? For example, could this wallet that I don't control tell its owner, or the government, that I am using it to access a certain website?
Yes, a malicious wallet could leak your information. This is why some governments will insist on using only blessed wallets. However, wallet+zk is strictly better than sending the plaintext MDOC to the relying party.
There are no solutions in this space, only tradeoffs, and elected representatives have picked one tradeoff.
In principle, you could use an open source implementation, but not a user-modifiable implementation.
Nothing stops a government from making their code open source and providing you with reproducible builds. You just won't be able to change the code to do something the government doesn't deem legal.
(1) in this case, an identity issuer provides the source of truth identity information. Examples include state DMV, your passport (you can try "Id pass" in Google wallet), etc.
(2) One of the goals of this project was to layer ZK on top of current identity standards that DMVs already issue, so that gov orgs don't have to change what they currently do to support the strongest user privacy. One example format is called Mdoc.
(3) The user holds the identity information on their device only. No other copies. The user's device makes the zkp proof on-device. This was one of the major technical challenges.
(4) The relying party (eg a website) runs the zk verification algorithm on the proof that is produced by the device to ensure soundness.
(5) Yes, the user can use any compatible implementation to produce the proof. We have open-sourced our implementation and we have a spec for the proof format that others can also reimplement.
If you can achieve RCE on the chip and run arbitrary code without invalidating signatures, does the protocol still stay secure?
If so, what's the point of requiring your implementation to run on a verified secure element? If not, the protocol seems only as strong as the weakest chip, as obtaining just a single private key from a single chip would let you generate arbitrary proofs.
The role of the secure element is only to "bind" the credential to the device, so that if you copy the credential somewhere else then the credential is useless. Concretely, the secure element produces a ECDSA signature that must be presented together with the credential. This is the normal protocol without ZKP. Concretely, the SE is in the phone, but could be a yubikey or something else.
The ZKP library does not run on the secure element. It runs on the normal CPU and produces a proof that the ECDSA signature from the SE is valid (and that the ECDSA signature from the issuer is valid, and that the credential has not expired, and ...) If you crack the ZKP library, all you are doing is producing an incorrect proof that will not verify.
Am I correctly understanding that I'd get the credential from say my state DMV once, and then later whenever I want to prove my age to a website the proof protocol is just between that website and my device? The DMV gets no information about what websites I use the DMV credential with and they get no information about when I use the credential even if the website and the DMV decide to cooperate? All they would be able to get was that at time T someone used a credential on the site that came from the DMV?
I tried to sketch out a design an age verification system, but it involved the DMV in each verification, which made timing attacks a problem. Briefly the website would issue a token, you'd get a blind signature of the token from the DMV's "this person is 18+" service, and return the token and unblinded signature to the website. I think that can be made to work but if the site and DMV cooperated they would likely be able to unmask many anonymous site users by comparing timing.
Getting the DMV out of the picture once your device is set up with the credential from them nicely eliminates that problem.
You are correct. The property that the colluding website and DMV still cannot identify you is called "unlinkability" and as far as I can tell cannot be achieved without zero-knowledge proofs. See https://github.com/user-attachments/files/15904122/cryptogra... for a discussion on this issue.
However, the timing attack resurfaces once you allow the DMV to revoke credentials. Exactly how the revocation is done matters. We are actively pushing back against solutions that require the DMV to be contacted to verify that the credential has not been revoked at presentation time, but this is a very nuanced discussion with inevitable tradeoffs between privacy and security.
No. ZK has a technical definition I don't want to get into, but note that the described system is deterministic and it always produces the same proof for Alice on a given day, and the proof for a later day can be derived from the proof for an earlier day. So two proofs can be linked back to Alice, and thus the system is not ZK. You need some kind of randomness for ZK.
Thanks for the reply. So in theory, I could get this MDOC file and store it on my desktop computer, and use an open-source library whose behavior I can verify, to provide the proof to the website via my web browser. Yeah? This sounds good to me.
No. Using the MDOC requires a signature from a hardware security key in the phone, and a lot of the complexity is how to avoid leaking the private key, which would identify you.
An alternative would be some secure chip in a credit-card size plastic document, but nobody seems to like that idea. We (Google) don't make these choices.
Are you trying to say that there’s a signed blob called an MDOC, that happens to have the age and name of the user, and this library allows a website to prove that the provided age belongs to the person with the MDOC, but not also see the name?
But to be clear, mdoc already accounts for this through its selective disclosure protocol, without the need for a zero knowledge proof technology. When you share an mdoc you are really just sharing a signed pile of hashes ("mobile security object") and then you can choose which salted pre-images to share along with the pile of hashes. So for example your name and your birth date are two separate data elements and sharing your MSO will share the hashes for both, but you might only choose to share the pre-image representing your birthday, or even a simple boolean claim that you are over 21 years old.
What you don't get with this scheme (and which zero knowledge proofs can provide) is protection against correlation: if you sign into the same site twice or sign into different sites, can the site owners recognize that it is the same user? With the design of the core mdoc selector disclosure protocol, the answer is yes.
It is decentralized. The holder provides the data, which was ultimately provided to them by the government, they're the client. The verifier is the entity that wants to know how old the holder is, the server.
The form are eg things like the JSON Web Token (JWT), Digital Credentials, and the Federated Credential Management API (FedCM).[1][2][3][4][5] The software can be anything since they're expected to use open protocols, so yes, web browsers.[6] Per the Commission, "For remote presentation flows, … the Wallet Instance implements the OpenID for Verifiable Presentation protocol OpenID4VP in combination with the W3C Digital Credentials API."[7]
The explanation that one person gave me was basically that you use an RNG to generate the challenges. Not sure if this is quite "proper", but it makes sense to me so long as you can't game the system. Perhaps make the RNG slow to prevent picking a convenient sequence?
You want to prove to everyone that you know where the Waldo on Page 12 of Where's Waldo In Iceland, so you hold a big white sheet of paper with a hole in it in front of the page such that the hole is centered on Waldo. Then you let your friend see. Your friend now knows that you know where Waldo is, but they still don't know where Waldo is, because they don't know the relative position of the book under the sheet. This is also why they can't use your proof to falsely prove to anyone else that they know where Waldo is too.
A simple signature scheme is based on proof of knowledge PoK{x : pk = g^x}, which is transformed into a noninteractive variant via the Fiat-Shamir transformation, where the message is appended to the hash. Range proofs work similarly, with the simplest form being for a single bit: PoK{(b,r) : C = g^b * h^r & b(b−1)=0}. This proves that commitment C contains a bit b in {0,1} without revealing which value it is.
Arbitrary ranges can then be constructed using the homomorphic properties of commitments. For an n-bit range, this requires n individual bit proofs. Bulletproofs optimize this to O(log n) proof size, enabling practical applications.
The commitment C can be issued by a trusted third party that signs it, and the user can then prove certain properties to a service provider, such as age ranges or location zones (constructed from latitude and longitude bounds).
A key challenge is that reusing the same commitment C creates a tracking identifier, potentially compromising user privacy.