by striking 4224 days ago
The problem here (for me personally, at least) is that Uber is not in the business of selling dates/"encounters" and that people don't expect a ridesharing company to go right for the sexual data. Even OKCupid is straddling the line here with http://blog.okcupid.com/index.php/we-experiment-on-human-bei... noting that:

  To test this, we took pairs of bad matches (actual 30% match) and told them they were exceptionally good for each other (displaying a 90% match.)
That's really not something people like having done to them. And the "HN crowd" shouldn't have an expectation of privacy and decency in data? Of course they're analyzing data, but it's really the viewpoint from which they do it that is unsettling. OKCupid says "no, duh, we're unethical. Deal with it." Uber says "Check it out! We drew a line between social security checks and prostitution!" (as waterlesscloud notes at https://news.ycombinator.com/item?id=8644138 )

There are a million more beneficial ways that people could be using the data. Fighting hunger, poverty, illiteracy, etc., to me, is a "good" use of Big Data. Looking at sexual habits (when you're not selling sex) or openly manipulating people to get data is, to me, a "bad" use.

3 comments

The idea that statistics have a moral imperative to be "decent" is as fascinating as it is ridiculous. Anonymized data is not a privacy breach, and Uber probably doesn't have any data that can help with "hunger, poverty, illiteracy, etc.".

I'm sorry if the idea that "people's short overnight stays are evident in their travel data" makes you blush, but that isn't anyone else's problem.

The whole point is that it's not anonymized and it's being inspected for and used for purposes that have little to do with the ostensible customer-service agreement.

People aren't used to transit companies interrogating them about the purposes of their journeys; they just want the transit company to get them from point A to point B. (Imagine if they did this when you got in the car: "Where are you going? Why?")

And obviously from a business perspective the more you understand your customers and their motivations the better you can serve them.

But let's not kid ourselves. This isn't anonymized data. Uber's publishing in a format that is unspecific, but they have all of the detailed data and can poke through it and infer things at their leisure, and they have no compunction about how they're doing it or why.

This is why ethics and trust around data collectors is really important. Uber seems pretty cavalier about it, and that actually is a problem.

> But let's not kid ourselves. This isn't anonymized data. Uber's publishing in a format that is unspecific, but they have all of the detailed data and can poke through it and infer things at their leisure, and they have no compunction about how they're doing it or why.

That's a fairly large accusation to make.

This blog post was originally published in 2012 - two years ago. Since then has anything come out that would confirm your suspicions? I haven't seen anything.

Sure. I don't normally like linking to TC, but this has a pretty good roundup of links: http://techcrunch.com/2014/11/20/following-pressure-from-u-s...
Well shut my mouth, thanks for the link.

..even if it's from TC (I won't hold it against you).

I included the word "decent" because the way they used data goes beyond people's "overnight stays": they previously analyzed people's spending patterns and tied them to welfare checks and prostitution. They immediately call it "one of the coolest things about working for a data-driven company like Uber" afterwards. It's bad data science, not only because it's only a correlation and not an experiment, so nothing can be proven, but because they use these unproven claims to say outlandish and unethical things.

I meant "decent" in an ethical sense, not in a conservative "don't you look at my 'short overnight stays'" sense.

>It's bad data science, not only because it's only a correlation and not an experiment so nothing can be proven, but because they use these unproven claims to say outlandish and unethical things.

I don't disagree that they've not scaled any sort of pinnacle in data science, but neither do I think what they're reporting is uninteresting.

In what way is what they're saying outlandish and unethical?

A little off-topic, but I don't see why OKCupid's actions here are unethical. Their matching algorithm isn't perfect, so they shouldn't treat it as an oracle of truth. How else would they discover false negatives in their algorithm? Especially since, in this case, a false negative is worse than a false positive (not meeting someone you'll like vs having one unsuccessful date).
> How else would they discover false negatives in their algorithm?

This is exactly why research that deals with humans at universities invariably must pass a human subjects review process. "How else would we discover X?" is certainly not a reason to subject anyone to an unethical experiment. Subjecting people to what you likely believe to be a bad date should very definitely raise red flags, even if the details in practice would pass a human subjects review.

And that's the trouble: there's a tremendous space of research that just isn't ethical to carry out on actual living humans. As such, we have to find methods to determine answers to those questions that don't breach ethical standards. The burdens of discovery must lie squarely on the researchers, not on the (often unwitting) experimental subjects.

Do you think that giving someone an artificially inflated OKCupid match really rises to the standard of an unethical experiment though? OKCupid doesn't tell you who to go on a date with; they just suggest potentially good matches. (Right? I'm married and don't tend to troll dating sites, but that's my understanding.) You're free to read their profile, exchange messages, etc., before arranging a date. If it is indeed a bad match, then most likely you would realize your incompatibility early in the process.
People need to at least understand what's being done and they need to give consent before it happens. Otherwise, you're literally toying with people's lives. And in this case it's not in some insignificant way: you're manipulating their romantic and sexual endeavors.

It's actually far, far more invasive than what Uber did as they described it in the blog post.

Have you read the terms and conditions of your latest bank account? The level of forced consent to third-party disclosure may alarm you.
That's a completely different category of life violation though. Imagine instead that your bank was lying to you about your account balance, modifying it to be plus or minus 3% of the actual balance. Without your consent or knowledge. All to conduct a "psychology/market experiment".

Then it would be equivalent.

Nothing in this post involved distortion of customer data. They just linked up transaction time/date and geo-location data. Then did some simple math. It's not out of the question that your payment processor could replicate this analysis... once your credit card processor cuts a deal to geo-tag your purchase history. Of course almost all fixed POS hardware is geomapped, and the mobile stuff is trackable, so that's not much of a stretch.
Don't they sell people on their super-accurate-awesomesauce-state-of-the-art matching algorithm? Were people warned that they may be guinea pigs?
It was unethical because they didn't warn their users ahead of time that they might randomly be opted into the alternate pairing system.
The Uber post almost certainly did not violate anyone's privacy. They ran a bunch of aggregate queries that probably dropped any PII pretty early on. They did not publish a list of riders who took a ride of glory.

(I say they "probably dropped PII" because when you do work of this sort, PII is boring data that slows down your calculations.)

Similarly, what's wrong with observing a correlation between welfare checks and prostitution? It's an interesting observation. It's potentially useful for public policy and fighting poverty (at least American style relative poverty), though of course a more detailed investigation needs to be done.