| HN Mirror



	by ebtalley 5114 days ago
	The data would be:
	* IP addresses: one can assume a bot would only have a limited set of addresses it could use, barring botnets
	* request patterns, e.g. did the bot request CSS/JS, etc.
	* request timeframes
	* UA strings
	Sure, it's a big data problem, but I can imagine that Facebook has solved these types of scenarios many times over.
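
A rough sketch of how those four signals could be combined into a single score, in Python. Everything here is an assumption for illustration: the `Session` shape, the `suspicion_score` name, the one-second threshold, and the UA substrings are made up, and nothing suggests Facebook does it this way.

```python
from dataclasses import dataclass

ASSET_SUFFIXES = (".css", ".js")

@dataclass
class Session:
    ip: str
    user_agent: str
    paths: list       # requested URL paths, in order
    timestamps: list  # request times in seconds, same order as paths

def suspicion_score(session, known_bad_ips=frozenset(), min_gap=1.0):
    """Count how many of the four signals look bot-like (0 to 4)."""
    score = 0
    # Signal 1: the IP address has already been flagged.
    if session.ip in known_bad_ips:
        score += 1
    # Signal 2: no CSS/JS was fetched, which a normal browser would do.
    if not any(p.endswith(ASSET_SUFFIXES) for p in session.paths):
        score += 1
    # Signal 3: page requests (assets excluded) arrive faster than a person clicks.
    page_times = [t for p, t in zip(session.paths, session.timestamps)
                  if not p.endswith(ASSET_SUFFIXES)]
    gaps = [b - a for a, b in zip(page_times, page_times[1:])]
    if gaps and min(gaps) < min_gap:
        score += 1
    # Signal 4: the UA string is empty or names a scripting tool (hypothetical list).
    ua = session.user_agent.lower()
    if not ua or any(token in ua for token in ("curl", "python", "bot")):
        score += 1
    return score

bot = Session("10.0.0.1", "curl/7.0", ["/", "/ad"], [0.0, 0.05])
human = Session("10.0.0.2", "Mozilla/5.0",
                ["/", "/site.css", "/app.js", "/ad"], [0.0, 0.1, 0.2, 12.0])
```

A real system would weight these signals rather than count them, but the count is enough to show how the pieces fit together.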

1 comments

zburt 5114 days ago

What if you start a new Amazon EC2 spot instance (netting you a new IP address), start up Chromium in headless mode (say, using Xvfb), navigate to the website of choice, use mouse automation to start clicking around, click the ad, spend 5 minutes clicking around in a semi-choreographed pattern on the advertisee's website, and then shut down the instance -- only to repeat?

Moreover, Amazon is always buying new IP subnets.


jonknee 5114 days ago

It sounds like you don't need to go to that much hassle currently, but even that rigmarole is simple enough to combat. The user account should be real, the usage real (comments, photos, messages back and forth), and the friends also real. False-positive spam IDs are OK: they will lower your revenue but won't constitute fraud against your customers. Put up a test for users you think are spamming; the test they already use, identifying photos of your friends, would be a good one.

Large numbers of real-looking fake accounts should be hard to keep up.


kirubakaran 5114 days ago

The user from that new IP won't have any real, human-like history: photos shared and commented on over time, etc.


freeall 5114 days ago

But why would you go through all that to click on ads? If I click on an ad for "Some Record Company" how does that make me money?


aidos 5114 days ago

It doesn't always need to make you money; sometimes you might just want it to cost your competitor money.


geori 5114 days ago

I don't count clicks from any amazonaws or ec2 hostnames on my site.
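
A minimal sketch of that kind of filter in Python, assuming the click log already carries a reverse-DNS hostname for each click. The `is_ec2_hostname` and `billable_clicks` names and the dict layout are hypothetical, and the matching rule (an `amazonaws.com` suffix or a first label starting with `ec2-`) is a guess at what "amazonaws or ec2 hostnames" means in practice.

```python
def is_ec2_hostname(hostname):
    """Return True if a reverse-DNS hostname looks like it belongs to AWS/EC2."""
    host = hostname.lower().rstrip(".")
    if host == "amazonaws.com" or host.endswith(".amazonaws.com"):
        return True
    # Assumed convention: EC2 public hostnames start with an "ec2-" label.
    return host.split(".")[0].startswith("ec2-")

def billable_clicks(clicks):
    """Keep only the clicks whose hostname does not look like AWS/EC2."""
    return [c for c in clicks if not is_ec2_hostname(c["hostname"])]

clicks = [
    {"id": 1, "hostname": "ec2-203-0-113-25.compute-1.amazonaws.com"},
    {"id": 2, "hostname": "host-42.isp.example"},
    {"id": 3, "hostname": "notamazonaws.com"},
]
kept = billable_clicks(clicks)
```

Reverse DNS is controlled by whoever owns the address block, so this only catches traffic that makes no effort to hide where it comes from.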


radicalbyte 5114 days ago

Then pay Amazon for a list of their EC2 IPs, or obtain that information from a public source (e.g. RIPE, ARIN).
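
Once you have such a list, checking a click's IP against it is straightforward. A short Python sketch using the standard `ipaddress` module follows. The CIDR blocks are documentation-only placeholder ranges, not real EC2 allocations, and the function names are made up.

```python
import ipaddress

def build_blocklist(cidrs):
    """Parse CIDR strings into network objects once, up front."""
    return [ipaddress.ip_network(c) for c in cidrs]

def in_blocklist(ip, networks):
    """Return True if the address falls inside any listed network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)

# Placeholder ranges (reserved for documentation), standing in for a real list.
networks = build_blocklist(["203.0.113.0/24", "198.51.100.0/25"])
```

For a list with thousands of ranges a linear scan gets slow, and a sorted interval lookup or a prefix trie would be the usual next step.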
