Hacker News
by lyxsus1 2532 days ago
> scraper goes on to use that information in ways that are for the commercial benefit of the scraper and usually to the detriment of the site owner

Yes, and they extract that benefit because in the process they add value to that information for the end user.

> Am I missing some other fundamental right that applies here?

Yes. When it's not prohibited by law, it's allowed.

> When it's not prohibited by law, it's allowed.

Frankly, I'm glad I don't live in your community.

I just got back from grocery shopping. Before I went in, I stomped off some mud I had on my feet. As I walked down the aisles, I accidentally knocked a bag of something off a shelf, so I turned around, picked it up, and returned it to its proper spot. When I got to the ice cream coolers, I first looked through the glass to decide what I wanted, and only opened the freezer door when I'd made my selection, sealing the door again once I was done.

There aren't any laws forcing me to do those things. But I'm happy to do them, because (a) when we all cooperate in this manner, it makes life better for all of us; and (b) the additional costs of cleaning and shrinkage would make the groceries more expensive for us if we didn't do it.

Have you ever been to a part of the world that doesn't have the same kind of "high trust" environment that most of us in the West enjoy? For example, in America we can safely assume that at a bus stop, people will of their own accord form up an orderly line when the bus comes, each person waiting their turn. My experience in some other nations is that people will swarm around the bus door, each jockeying to be the next one in. There aren't any laws forcing people to act as they do here, but I think we're all better off with the societal norm that drives people to do it our way.

Oh hey, I got where you were mistaken! :)

You assumed that people who don't want their data being scraped are the nice guys and people who want to use that data are the baddies. Nope, my point was that it's the other way around and if you're truly concerned with community wellbeing you actually have to be on my side of this argument.

I support every point in your response, except that I don't get your motivation to fight for folks who either don't understand how the internet works or consciously use dirty tricks to block information access just to protect their shaky profits.

Look it up, my first comment in this thread was about making it illegal because I think everybody would benefit from making it so. If you publish your data and it's accessible by browser and not copyrighted, we shouldn't make it hard to collect that data for automated processing.

> use dirty tricks to block information access just to protect their shaky profits.

Look up to the top of this sub-thread, I started it. In that comment I specifically addressed that in the systems I'm responsible for:

- The traffic that (apparently) comes from competitors scraping our prices exceeds the traffic coming from legit customers. We're paying more to supply data to our competitors than to our customers!

- There are sometimes actual reasons beyond wanting "shaky profits" for wanting to limit what site users can do, including development resources to build features and APIs, as well as the actual cost of the computations.

- And I have no idea where your assertion about "dirty tricks" comes from. I'm having trouble finding anything "dirty" in trying to detect people abusing the system, and temporarily blocking their access.

And like I said above, I'm still glad I don't live in your neighborhood. Because not only don't you have any interest in being a positive member of society in commerce (you ignored all my comments on that topic), but I now see that on a personal level (e.g., "You assumed that...") you're also condescending.

> - The traffic that (apparently) comes from competitors scraping our prices exceeds the traffic coming from legit customers. We're paying more to supply data to our competitors than to our customers!

Make an API, put your site behind a CDN. Couldn't be simpler. And there's more you could do.

> - There are sometimes actual reasons beyond wanting "shaky profits" for wanting to limit what site users can do, including development resources to build features and APIs, as well as the actual cost of the computations.

Already answered that.

> - And I have no idea where your assertion about "dirty tricks" comes from. I'm having trouble finding anything "dirty" in trying to detect people abusing the system, and temporarily blocking their access.

There was an attempt to make a startup to compare prices in local stores, and it caused outrage among shop owners. They too claimed they were "abused". If you dive into how standards like HTML, HTTP, etc. were designed in the first place, you'd find that they were made with the idea that data is expected to be easily digestible by machines. Fighting that is futile and keeps us from having nicer things.

You could just export your prices in some CSV form on a regular basis if making a proper API is too hard, and redirect incoming scraping traffic to some README page instead of fighting a battle you can't possibly win. That's of course only valid if the business doesn't mostly rely on depriving customers and competitors of information about prices. In that case you have my compassion, but it's clear that the pro-community-social-bla-bla-bla rhetoric is nothing more than a disguise. That I understand, but oppose.
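A minimal sketch of that CSV-plus-README idea, to make it concrete. Everything here is an assumption for illustration: the `PRICES` rows, the `README_URL` path, the `BOT_MARKERS` list, and both function names are hypothetical, and matching on the User-Agent string only catches bots that identify themselves.

```python
import csv
import io

# Hypothetical catalogue; in a real shop this would come from the database.
PRICES = [
    {"sku": "A-100", "name": "Widget", "price": "9.99", "currency": "USD"},
    {"sku": "B-200", "name": "Gadget", "price": "24.50", "currency": "USD"},
]

# Hypothetical page that tells bot authors where the regular dump lives.
README_URL = "/README-data.html"

# Assumed substrings for spotting self-identified bots. Scrapers that fake a
# browser User-Agent will not match, so this only handles the polite ones.
BOT_MARKERS = ("bot", "crawler", "spider", "scrapy", "python-requests", "curl")


def export_prices_csv(rows, out):
    """Write the price rows to a file-like object as CSV with a header row."""
    writer = csv.DictWriter(out, fieldnames=["sku", "name", "price", "currency"])
    writer.writeheader()
    writer.writerows(rows)


def redirect_target(user_agent):
    """Return the README URL for self-identified bots, otherwise None."""
    ua = (user_agent or "").lower()
    if any(marker in ua for marker in BOT_MARKERS):
        return README_URL
    return None


if __name__ == "__main__":
    buf = io.StringIO()
    export_prices_csv(PRICES, buf)
    print(buf.getvalue())
    print(redirect_target("Scrapy/2.11 (+https://scrapy.org)"))
```

In practice the export would run on a schedule (cron or similar) and write to a static file served from the CDN, so bots that follow the README never touch the application servers.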

Or should I finally respond to that lame remark about the neighborhood you keep trying to push? Meh.

Sorry to say, but it seems your moral compass isn't working correctly. Not sure it's productive at all to continue this conversation with someone whose understanding of that fundamental premise is so completely different from pretty much anyone I know. Hopefully that's something you'll think about.

Well, I beg to differ. I'm against a kindergarten level of morality.

Look, luddism is bad for everybody in the long term. We saw riots against Uber in some countries, where people turned cars upside down or burned them. It hurts their earnings and they feel it's unfair. And you may say it's immoral for Uber to do so. But what Uber did is simple. They identified multiple huge inefficiencies in that market: pricing, negotiation, checkout, reputation. They solved them and found a way to profit from it. And that will eventually happen in every corner where a huge inefficiency exists due to a mere lack of communication and price negotiation. And yes, prices in most cases will have to go down and become less dispersed, and some won't like it at all. But that's just competition re-enabled by technology.

Is it immoral? I don't think so, because the net effect is positive. Is it your right to charge a tourist 10x more for an airport-hotel trip or take a 2x longer route just to earn a bit more? Technically yes, but who would endorse such behaviour? You don't like the price? Don't take that client. The platform is systematically lowering prices or violating existing regulations? Vote for better regulations, vote for enforcing those that already exist. The platform is fundamentally broken? Well, make a better one. Technically, it's not that hard; Uber is one of the most replicated business ideas at the moment.

And about that guy complaining about those competitors who are scraping their prices. So they're scraping each other and protecting their websites from being scraped by each other? So in the end they all have their competitors' information, but have pumped lots of resources into scraping, protection, and trying to serve content to both bots and clients? Wow, what a tragedy. What a horrible person would want it to stop. If that's it, they could just as simply pick up the phone and agree to share that data between them, because in the end the outcome will be the same, minus the resources wasted on an arms race.

Or better, make your data machine-friendly from the beginning. Because eventually, they'll do that. Eventually somebody like Google or Amazon or some other big company will find an incentive to make them gradually and willingly share and structure that data. And somebody will find a way and the resources to integrate that data into reusable knowledge graphs and somewhere along the way create a positive feedback loop. And somebody will profit hugely from that. Consumers will surely benefit, that somebody will, and data-donor companies that adapt will too.

And don't forget there's some progress in ML, automated decision making and all that. Personally, as a customer, I would love to have the best prices and objective product comparisons with zero interaction with multiple wacky small vendors' websites. I'd rather have something like Siri do that for me.

One way or another, it's happening. Small businesses have little say here, if any. My unpopular opinion was that we should recognise that process, do something to stop wasting resources on the war between scrapers and anti-scrapers, and hopefully avoid another single monopolist emerging to solve the problem for us. Because if we don't, we'll just have another few years of HN headlines about how bad that X unicorn company is to somebody.

Now what's wrong with that?

The fact that you really don't seem to understand the moral issue at all is really disheartening.

So. Instead of giving any arguments you prefer to claim the higher ground based only on some intrinsic quality miraculously shared only by people who agree with you, while some terrible people lack it and deserve pity. Nice try, but nope.

I get it, really. The world changes quickly, people don't. They have lives, it hurts, they're sad. We're empathetic, we're sad too. We don't want to be sad, so we don't want them to be sad. See? Easy. Except that it doesn't help anybody beyond a mild therapeutic effect. And it's completely irrelevant to this thread.

If anybody cares to explain and expand on what that moral issue I'm missing here is, or how it's relevant, I'm all ears.

I feel like I've put much more effort into making this social interaction fruitful and got only a lazy "I'm sorry for you" in return. Ouch.