by bonerman69 2453 days ago
> for Bandwidth Alliance partners, we’re going to hand the IP of the bot to the partner and get the bot kicked offline;

What's that mean, 'kicked offline'?

Isn't scraping 'legal'?


There's a lot more to bots than just scraping... DDoS bots, bots that buy hot sneakers before the public gets a chance and drive up the price, bots that do credential stuffing, bots that play nasty tricks with airline seats ...
Only two of those are malicious.

Buying products at the price offered is perfectly legitimate, regardless of all the scaremongering to the contrary.

Not sure what "legitimate" means here? Legal? Running aggressive web crawlers is in many instances against the rules for consumer cloud servers. For example, AWS requires that you obey robots.txt if you run a crawler there. https://aws.amazon.com/premiumsupport/knowledge-center/repor...

In my experience a lot of bots seem to be running on hacked servers or through hacked/insecure proxies. I'd imagine tracking down the owner or someone upstream of those boxes could be effective in taking them offline.

What does that have to do with my point? Bots used to purchase inventory (and that aren't otherwise committing fraud by using stolen credit cards or something) are not malicious.
> Bots used to purchase inventory are not malicious.

There's no way you're in this conversation without being aware that scalping is a controversial practice at best.

https://theconversation.com/the-economics-of-ticket-scalping...

https://en.wikipedia.org/wiki/Ticket_scalping

I'm well aware that many economically illiterate people like to scaremonger about scalping.

That doesn't make them right.

Are they following the sneaker website's robots.txt while doing that? If not, they are probably violating the AWS terms regardless of whether you believe that activity is "malicious."
if they're running on AWS, which most crawlers are not

When I've run scraping software in the past I used DigitalOcean, whose ToS doesn't contain a requirement to abide by robots.txt. As far as I can tell it's both legal and consistent with their ToS to run a program that makes purchases on a website.

They don't seem to specify that on this post. I hope it's merely miscommunication...
Then my question to you would be, is it possible for a "legitimate", robots.txt-respecting scraper bot (for a non-profit I'm helping, for example) to get caught in CF's detector? If so, is there a way to detect that this is happening and an avenue to get unblocked?
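For anyone wondering what "robots.txt-respecting" looks like in practice, here's a rough sketch using Python's standard-library `urllib.robotparser`. The robots.txt content, the `example-bot` user agent and the `shop.example` URLs are all made up for illustration, and passing this check says nothing about whether CF's detector will still flag the bot.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body. A real crawler would download this from
# the target site (e.g. via RobotFileParser.set_url() followed by read()).
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout/
Crawl-delay: 5
"""

# Hypothetical user agent string for the crawler.
USER_AGENT = "example-bot"


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been fetched."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, url: str) -> bool:
    """Return True if the rules allow USER_AGENT to request this URL."""
    return parser.can_fetch(USER_AGENT, url)


parser = build_parser(EXAMPLE_ROBOTS_TXT)

# A polite crawler checks each URL before requesting it and waits
# crawl_delay seconds between requests when the site specifies one.
product_allowed = may_fetch(parser, "https://shop.example/products/1")
checkout_allowed = may_fetch(parser, "https://shop.example/checkout/cart")
delay_seconds = parser.crawl_delay(USER_AGENT)
```

With the made-up rules above, the product page is allowed, the checkout path is not, and the crawler should wait 5 seconds between requests.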
I'm worried about the article's comment:

> And the unwitting users who are part of the botnet have their resources, such as their home broadband connection, used without their consent or knowledge.

Perhaps I missed something, but doesn't this mean that a lot of homes and IoT devices that have been compromised will have increased CPU usage as a result of CF implementing the bad bot behaviour envisaged? In other words, the botnet owners won't care in the slightest about your response, but a lot of homes will suddenly get shoddy performance as their router grinds to a halt.

(Yes, of course it'd be best if every home and every IoT device were secure... but that's unrealistically optimistic)

>This type of attack hurts multiple targets as well: the ecommerce site has real frustrated users who can’t purchase the in demand item. The real users who are losing out on inventory to an attacker who is just there to skim off the largest profit possible. And the unwitting users who are part of the botnet have their resources, such as their home broadband connection, used without their consent or knowledge.

Have you run this by an economist? It's pretty basic economics that "scalping" increases consumer welfare, despite your cursory claim to the contrary.

If the IPs are part of a botnet, that's one thing. But the biggest residential IP network is Luminati, which does have consent for their IPs.

> It's pretty basic economics that "scalping" increases consumer welfare, despite your cursory claim to the contrary.

Citations needed.

Here's some reading and links to illustrate that your comments are ignorant of several dimensions of the discussion: https://economics.stackexchange.com/questions/6576/is-scalpi...

All of the comments there either agree with me or are talking about non-economic factors. E.g. the first one explicitly says it's not about economic harm. The second says they're taking producer surplus, which is correct if you interpret that as potential surplus had the producers priced higher. The third one also explicitly points out that it's not being looked at from an economic perspective.
Cloudflare isn't in the business of enforcing laws; they enforce whatever arbitrary decisions they make.
>Isn't scraping 'legal'?

Assuming you're building on the LinkedIn/hiQ case here, the ruling would only be applicable to a subset of cases (public, user-generated content, no authorization...).

Even before the ruling is overturned, we can't say scraping is legal without applying some qualifying conditions.