If there is, the 'someone else' would still have to do some rate limiting. Otherwise, once again, I can just run this algorithm for all possible phone numbers.
No, no. Read my link :) The whole point is that the definition of 'zero-knowledge' is that (if you use a zero knowledge interactive protocol) you learn no further knowledge.
You are right that the hash-passing is not zero-knowledge. Rather than just verify whether it's in a set, you've accidentally revealed a number. So if you have a set of phone numbers and I have 1 to check if it's in that set, then checking hashes doesn't work - I would end up divulging my number if I used it.
Intersecting a set of observed identifiers against another set is an issue Moxie dealt with on the TextSecures blog[0] (for phone numbers at that). He evaluated some ZKP algorithms and basically concluded this problem is hard or impractical today. I imagine an ad/referral network has many of the same real time constraints and scalability issues.
thanks - this was interesting. Some of the numbers are interesting. So, for example, he says to have ten million clients updating daily, he would need to sustain 40 MB 116 times every second. . . that's like 37120 megabit so if you pay $10 per megabit monthly, that's $371,200 per month. It's pretty bad. On the other hand you do have ten million users.
so the numbers are off, but they're not five orders of magnitude off. If we're going to service ten million users, we might have something of a hosting budget for it.
So while I believe the author that the current trade-offs aren't goods, clever academic mathematics might help in the future. Cryptography itself came from there.
You are right that the hash-passing is not zero-knowledge. Rather than just verify whether it's in a set, you've accidentally revealed a number. So if you have a set of phone numbers and I have 1 to check if it's in that set, then checking hashes doesn't work - I would end up divulging my number if I used it.