> I examined BT's CleanFeed system (its proper name is the BT Anti-Child-Abuse Initiative). This was designed to be a low cost, but highly accurate, system for blocking "child pornography". At first sight it is a significant improvement upon existing schemes. However, CleanFeed derives its advantages from employing two separate stages, and this hybrid system is thereby made more fragile because circumvention of either stage, whether by the end user or by the content provider, will cause the blocking to fail.
How would you suggest we get a list of every domain name currently registered? How would you suggest we tell blocked domains apart from ones that are redirected, offline, or similar? And how do you recommend we pay for the billions of compute hours such a task would take?
Don't be so negative. You can get the master .com domain list (the zone file) from Verisign as a download. Identify blocked domains by the holding page and header info. I'm guessing they do something like the guys here, with a big page saying "seized".
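Here is a rough sketch of what "identify by the holding page and header info" could look like. Everything specific in it is made up for illustration: the marker strings, the `Server` header value, and the function name `classify_response` are assumptions, since nobody in the thread has said what the real holding page contains. It only classifies a response you have already fetched, so it says nothing about the cost of fetching them all.

```python
from typing import Mapping

# Hypothetical markers: the real holding-page text and headers are unknown here.
HOLDING_PAGE_MARKERS = ("this domain has been seized",)
SUSPECT_SERVER_HEADERS = ("blockpage",)

REDIRECT_CODES = {301, 302, 303, 307, 308}


def classify_response(status: int, headers: Mapping[str, str], body: str) -> str:
    """Guess whether a fetched response is a block page, a redirect, down, or normal."""
    lowered_body = body.lower()
    # 1. Holding page: look for a tell-tale phrase in the body.
    if any(marker in lowered_body for marker in HOLDING_PAGE_MARKERS):
        return "blocked"
    # 2. Header info: compare the Server header (case-insensitively) to known values.
    server = ""
    for name, value in headers.items():
        if name.lower() == "server":
            server = value.lower()
    if any(marker in server for marker in SUSPECT_SERVER_HEADERS):
        return "blocked"
    # 3. Otherwise fall back to the status code.
    if status in REDIRECT_CODES:
        return "redirected"
    if status >= 500:
        return "offline"
    return "ok"
```

For example, `classify_response(302, {"Location": "http://example.org/"}, "")` comes back as a plain redirect rather than a block, which is the distinction the question above was asking about.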
Not really. Not only would brute forcing take impractically long (you'd need all IP addresses as well as all allocated domains), but the blocking system blocks individual pages within sites, and those could not be found by this method.
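To make the per-page point concrete, here is a toy model of a two-stage filter. It is a simplification based on my reading of the hybrid design: the first stage picks out traffic by destination IP and the second stage matches the full URL. The addresses, URLs, and the name `is_blocked` are all invented for the example.

```python
# Toy data: documentation-range IP and example.com URLs, not real blocklist entries.
SUSPECT_IPS = {"203.0.113.7"}
BLOCKED_URLS = {"http://example.com/bad/page.html"}


def is_blocked(dest_ip: str, url: str) -> bool:
    """Return True only if the request fails both stages of the toy filter."""
    # Stage 1: only traffic to a suspect IP is diverted for closer inspection.
    if dest_ip not in SUSPECT_IPS:
        return False
    # Stage 2: the diverted request is blocked only if the exact URL is listed.
    return url in BLOCKED_URLS


# Probing the site's front page looks clean even though one page is blocked.
print(is_blocked("203.0.113.7", "http://example.com/"))
print(is_blocked("203.0.113.7", "http://example.com/bad/page.html"))
```

A scan that only requests the front page of each domain would never reach the second line's URL, so it would report the site as unblocked.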
The paper linked on his page here - (http://www.cl.cam.ac.uk/~rnc1/)