Hacker News
by Karunamon 4946 days ago
Business idea: A "chaff box" that can be sold to the public.

Given a list of dodgy search keywords, YouTube links, etc etc etc, regularly updated from a central location (think of a Websense blocklist, but in reverse), the box uses a configurable amount of bandwidth. It hits these sites with a human-like usage pattern whenever HTTP traffic from your LAN IP is detected (so it only works when you're actually browsing the web).

Plug it in and gain plausible deniability from most forms of government shenaniganery. Given critical mass, makes most forms of government behavioral analysis (and possibly advertiser behavioral analysis) useless.

Build it on the Raspberry Pi or a similar platform. Materials cost is $35 plus shipping materials. Main time investment is limited to maintaining the blocklist and the central servers.

Hmm. Wonder how this could sell to the soccer mom crowd...

Would also raise some interesting and thorny questions for the server side. If enough people are using the box for the effect to be meaningful, then a lot of sites are going to have a lot of useless web traffic; yet allowing sites to "opt out", or putting an identifier of some kind on the box's traffic, completely defeats the purpose of the system.

9 comments

There's a story by Cory Doctorow in which terrorists blow up the Bay Bridge and the US establishes a surveillance state in the wake of those events. In response, the protagonist creates a distributed system using Xboxes that works pretty much the way you're suggesting.
The story is called Little Brother for those who are interested.
I'll admit I don't know all that much about machine learning and statistics, but it seems like it would be pretty hard to simulate human activity in a way that was really indistinguishable (highly sporadic, with trends of connected ideas, for a start). More immediately, most people are never going to get on board with making it look like they're into "bad stuff". It's icky, and they don't think they have that much to lose.
This is an interesting idea. However, for a noise box to be effective, a significant number of other people must also be using one, which provides plausible deniability the same way Tor and shared-IP VPNs do.

If you're the only one using a noise box, or are part of a very small minority of users who do, the random noise you generate just increases your attack surface and makes it easier for the government to target and identify you.

Tor essentially provides the same plausible deniability to its end-node users, without needing to simulate human behavior.
Forget the box, you just need a web browser plugin. It could sit in the background. It could have two lists (updated occasionally, like spamblockers do it), one of search engines and one of spook-luring phrases. Every, say, rand(1..10) minutes, it could make a few connected queries from list B to some engine in list A. Visit a link or two from the results page. Stop after, say, rand(1..10) queries total on that theme. Throw everything away and go back to sleep.
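
A minimal Python sketch of that loop, for illustration only. The list contents, the `example` URLs and the `plan_session` name are all made up here, nothing is actually fetched, and the "visit a link or two" step is left out; it only plans one burst.

```python
import random
from urllib.parse import quote_plus

# Hypothetical placeholder lists; a real plugin would download and refresh these.
SEARCH_ENGINES = ["https://search-a.example/?q=", "https://search-b.example/?q="]  # list A
THEMES = {  # list B, grouped by theme so the queries in a burst are connected
    "theme-one": ["phrase one a", "phrase one b", "phrase one c"],
    "theme-two": ["phrase two a", "phrase two b", "phrase two c"],
}

def plan_session(rng):
    """Return (sleep_minutes, urls) for one themed burst of chaff queries."""
    sleep_minutes = rng.randint(1, 10)   # rand(1..10) minutes before the burst
    engine = rng.choice(SEARCH_ENGINES)  # some engine from list A
    theme = rng.choice(sorted(THEMES))   # one theme from list B
    n_queries = rng.randint(1, 10)       # rand(1..10) queries total on that theme
    phrases = [rng.choice(THEMES[theme]) for _ in range(n_queries)]
    urls = [engine + quote_plus(p) for p in phrases]
    return sleep_minutes, urls

if __name__ == "__main__":
    sleep_minutes, urls = plan_session(random.Random())
    print(f"sleep {sleep_minutes} min, then request {len(urls)} URLs")
    # After the burst: throw everything away and go back to sleep.
```

Passing the random generator in as a parameter is just a convenience so the planner can be seeded and tested without touching the network.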

If a million people installed this plugin, each would average about 5 queries every 5 minutes, so all together that would average about 1.4e9 queries per day, a tiny fraction of the intertubes.
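
The arithmetic behind that estimate, spelled out as a quick Python check. It assumes the plugin runs around the clock for every user, which is the generous case; the constant names are only for illustration.

```python
USERS = 1_000_000
QUERIES_PER_BURST = 5        # rough average per burst, as estimated above
MINUTES_PER_BURST = 5        # rough average gap between bursts
MINUTES_PER_DAY = 24 * 60

# 5 queries per 5 minutes is 1 query per minute per user.
per_user_per_day = QUERIES_PER_BURST / MINUTES_PER_BURST * MINUTES_PER_DAY
total_per_day = USERS * per_user_per_day

print(f"{per_user_per_day:.0f} queries per user per day")   # 1440
print(f"{total_per_day:.2e} queries per day in total")      # 1.44e+09
```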

edit: but, apology to parent, you'd never sell a browser plugin...

If it's from a central location and all clients are working off the same database, it seems like it would be fairly simple for their data mining teams to sift out the identical chaff.
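
A rough Python sketch of how little work that sifting would take, assuming the analyst has a copy of the same list the boxes use. The function name and the `example` URLs are hypothetical.

```python
def strip_known_chaff(requests, chaff_list):
    """Drop every logged request whose URL appears on the shared chaff list."""
    chaff = set(chaff_list)
    return [url for url in requests if url not in chaff]

# Hypothetical data: one box's traffic log and the centrally published list.
chaff_list = ["https://dodgy-one.example/", "https://dodgy-two.example/"]
observed = [
    "https://news.example/",
    "https://dodgy-one.example/",
    "https://mail.example/",
    "https://dodgy-two.example/",
]

print(strip_known_chaff(observed, chaff_list))
# ['https://news.example/', 'https://mail.example/']
```

Note that a filter this blunt also throws away any real visit to a listed URL, since it cannot tell the two apart.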
In theory, couldn't an interested party filter out the blocklist from all the other traffic? It would be more interesting to have a dynamic list that gets updated based on actual user behavior. That way, the interested party wouldn't be able to filter out the blocklist without potentially filtering out actual traffic. This, of course, would create all kinds of legal issues.
In theory yes, but everything hinges on the "given critical mass" thing - once a large number of the sites the government would look askance at you for visiting are on the chaffbox's list, and therefore being browsed by a large number of people, it serves to protect someone who wants to view one of these sites legitimately.

A dynamic list would be better, granted, but a much harder nut to crack.

I think there are already a couple of products like that... in fact, I have a few in my house.

I call them my cable modem and TiVo.

You'd want it to be decentralized. Centralized servers are so 2000.
Tell that to everyone storing their emails and music in the cloud.