What does social science have to offer the data industry? (towardsdatascience.com)
38 points by schaunwheeler 2964 days ago
8 comments

I think there is an important lesson in the approach to science that we see in the social sciences. Coming from a technical background, we approach science and data as the fundamental way to discover truth, but with humans there is often more than one truth.

We’ve seen the effect of it in management over the past 25 years. Today a good manager is expected to approach a team, not by instructing them in what to do and when to do it, but rather by creating a shared meaning through group conversation. It’s more important when you manage people who produce by thinking and being creative, but even at the factory line, this softer approach is proving useful.

We haven’t yet applied this to big data. I’m often sold ML as the ability to predict the future, and to some extent that is true. If I look at all the alcoholic families in my municipality and compare their case history with big data gathered on a national level, I’ll certainly be able to predict how many of their children we’ll need to remove. I just can’t predict which ones, because determinism doesn’t actually work on something that complex.
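
The gap between "how many" and "which ones" can be sketched with a toy simulation. This is only an illustration under strong assumptions that are not in the comment above: every case carries the same risk, outcomes are independent, and the names `simulate_caseload`, `n_cases` and `risk` are made up for the example.

```python
import random


def simulate_caseload(n_cases=10_000, risk=0.2, seed=0):
    """Toy model: every case has the same risk and outcomes are independent.

    Both the uniform risk and the independence are assumptions made for
    illustration; real case files are not like this.
    """
    rng = random.Random(seed)
    outcomes = [rng.random() < risk for _ in range(n_cases)]

    # Aggregate forecast: the expected number of positive outcomes.
    predicted_count = round(n_cases * risk)
    actual_count = sum(outcomes)

    # Individual forecast: with identical risk there is nothing to rank on,
    # so flagging `predicted_count` cases is no better than a random draw.
    flagged = rng.sample(range(n_cases), predicted_count)
    precision = sum(outcomes[i] for i in flagged) / predicted_count

    return predicted_count, actual_count, precision


predicted_count, actual_count, precision = simulate_caseload()
print(predicted_count, actual_count, round(precision, 3))
```

In this setup the predicted count lands close to the actual count, while the share of flagged cases that turn out positive stays near the base rate. Real models have features to rank on, so they would do better than a random draw; the sketch only shows why a good aggregate forecast does not by itself imply good individual forecasts.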

The more data we have, the less we understand about causality, something I’ve learned from history. If you look at the Roman Empire without digging into it, choosing Christianity seems obvious, but if you really get all the data on their options and then try to figure out why they did what they did, you’ll have no clue. Another example is online advertising. I read a newspaper that I’ve never seen a single ad for, and I see a lot of ads for newspapers. I’m often called by newspaper salesmen as well, but not for the one I read. This is because it doesn’t suit my elaborate online profile. My profile tells the ad agencies what I should read, but it doesn’t tell them why, and the difference is failing them.

If we really want ML and big data to be truly useful, I think we need to learn from the social sciences, because they work much more with the complicated science behind the why.

Except that they still can't give the author what she needs in order to do her work: a tool she can use. The social sciences have not yet produced anything that can effectively produce a change or even adequately describe a social system.

Her tagline is appropriate. She'll gladly use a tool that works, but she won't use one on faith.

You're right, but I think my point is that data science hasn't really done anything useful in fields involving humans.

Don't get me wrong, we do a lot of data science in the public sector these days, but we also measure its efficiency and capability and compare it to the past 100 years of us doing the same thing without AI, and things haven't improved. At least not yet.

Meanwhile, the social sciences have given us tools that help us inflict actual lasting change on groups of people simply by using language in a specific way or working toward a shared consensus.

So maybe the question shouldn't be what social sciences have to offer AI research, but rather, what data science has to offer social fields.

I'm well aware that data science has its value in other fields. We use it to trawl through massive amounts of case files and save thousands of man-hours in the process, but why would you want to use social science for that?

>"data science hasn't really done anything useful in fields involving humans."

So you think data science has not been useful to Google, Amazon, Facebook, etc.?

Basically you are proposing that (at least) hundreds of billions of dollars have been spent with no return so far. It is possible, but is there evidence for this?

>"If I look at all the alcoholic families in my municipality and compare their case history with big data gathered on a national level, I’ll certainly be able to predict how many of their children we’ll need to remove. I just can’t predict which ones because determinism doesn’t actually work on something that complex. [...] I think we need to learn from the social sciences, because they work much more with the complicated science behind the why."

Huh? ML classifiers will definitely give you a prediction for each individual case. It's the social sciences that have been choosing to look at an average effect at one single timepoint, etc., and trying to get some kind of causal model from that (a dumb idea in my opinion, since causality works at the individual level).

EDIT:

I should also say I am open to the idea that causality isn't a real, or at least interesting, thing anyway. E.g., PV = nRT: does that mean changing pressure changes the temperature, or vice versa?

The gas law is an equilibrium condition, so if you change one the others must change too. But they don’t change spontaneously; you impose the change from without, so there is no ambiguity: whatever you force to change first will be the cause of the others changing.
Right, the model accounts for all possible "causal routes", turning causality into something subjective.
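
The gas-law exchange above can be sketched numerically. The equation itself has no direction; the direction comes from which quantity is treated as the imposed input. The function names and the numbers below are assumptions chosen for illustration, with SI units throughout.

```python
R = 8.314462618  # molar gas constant in J/(mol*K)


def pressure(n, T, V):
    """Pressure in Pa from PV = nRT, given moles, kelvin and cubic metres."""
    return n * R * T / V


def temperature(P, V, n):
    """Temperature in K from the same equation, solved the other way."""
    return P * V / (n * R)


# One mole of gas in a rigid 25-litre container at 300 K (illustrative values).
n, V = 1.0, 0.025
p_start = pressure(n, 300.0, V)

# Intervention: heat the gas to 330 K at fixed volume; the pressure follows.
p_after_heating = pressure(n, 330.0, V)

# Same algebra read in the other direction: treat that pressure as the
# imposed quantity and the equation hands back the temperature.
t_recovered = temperature(p_after_heating, V, n)

print(p_start, p_after_heating, t_recovered)
```

Both functions encode the same constraint. Which variable counts as the "cause" is decided by what the experimenter holds fixed and what they force to change, not by the formula.
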
I think having a background in sociology or psychology would be extremely good for AI researchers and anybody who needs to work with machine learning in general, because ultimately those systems will have to interact with people, and a lot of those algorithms will have a huge impact on people’s lives, and you need to be sure that you aren’t encoding biases, etc., that are going to harm people or unfairly exclude them.
With a name like empath, I'm compelled to wonder if what you're maybe suggesting is not that they remove their biases, but rather that they apply their own biases in order to achieve a more desirable outcome in accordance with their contextualization of the data and desired outcomes.

I'm not necessarily suggesting that as a bad thing, I just wanted to clarify that it's actually very easy to come up with some very unpleasant data that is totally devoid of bias.

In some ways couldn’t that be considered social engineering through AI? If the training set shows bias and thus so does the algorithm, it’s a reflection of reality. Manually changing the algorithm to influence society on a grand scale is dystopian. It’s a dangerous path that sounds humanitarian but is really authoritarian.
The problem is that algorithms tend to amplify bias, rather than just reflect it. We’re constantly being told (implicitly or explicitly) that we should trust ML because computers are objective, but that ignores many of the most important variables in the training set.

Google’s Deep Dream is a great way to visualize this. Given a source image that you repeatedly feed through an algorithm that attempts to parse and recreate the image, an unbiased algorithm would produce something similar to the original. Instead you get dogfishbirds and eyes everywhere — that’s the bias of the training set getting amplified.
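
The difference between reflecting and amplifying a bias can also be shown with a toy classifier. This is a hedged sketch, not a claim about any real system: the groups, the 70/30 split and the `majority_classifier` helper are all invented for the example, and the model is deliberately given nothing but group membership to work with.

```python
from collections import Counter


def majority_classifier(rows):
    """Learn the most common label per group from (group, label) pairs.

    This is the error-minimising rule when group is the only feature.
    """
    counts = {}
    for group, label in rows:
        counts.setdefault(group, Counter())[label] += 1
    return {group: c.most_common(1)[0][0] for group, c in counts.items()}


# Invented training data: 70% of group A and 30% of group B carry label 1.
train = [("A", 1)] * 70 + [("A", 0)] * 30 + [("B", 1)] * 30 + [("B", 0)] * 70

model = majority_classifier(train)

# Positive rate per group in the data versus in the model's predictions.
rate_in_data = {
    g: sum(label for group, label in train if group == g) / 100 for g in "AB"
}
rate_in_predictions = {g: float(model[g]) for g in "AB"}

print(rate_in_data)
print(rate_in_predictions)
```

The data has a 70/30 gap between the groups; the predictions turn it into 100/0. A model that merely reflected the data would reproduce the 70/30 rates, so the hard decision rule is what does the amplifying here.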

DeepDream is an apt example on a deeper level of analogy. What they wanted was an advance in the important topic of interpretability/explainability. An offshoot of a failed experiment turned into the subfield of style transfer and pretty pictures. That became a success of AI somehow, one to talk about and dazzle audiences with.

About the OP's ignorance: well, statistics started off as a social science, so maybe self-professed data scientists, while looking into social sciences, ethics and psychology, could also look into history.

Your example doesn't make sense, like, at all.

If you want an example of bias, look at COMPAS and recidivism.

Ask not what social science can offer the data industry, ask what the data industry can offer social science.
Ouch.

I think one way to think of this is, does a given field have any tools that, even if you disagreed with their values, you would still want access to? People who don't like the idea of natural selection, still want doctors to take into account the phenomenon of antibiotic-resistant infections in their treatment. People who dislike the values of the software industry, often still want access to computers to publish their essays, surf the internet, etc. People who dislike the analytic, anti-holistic orientation of the physical sciences, still want access to the technology made using that.

What is there in the social sciences that you would want access to, even if you did not share their values? I think we may live to see a day when there is something, but I'm not sure that right now there is (yet).

You don't like cost-benefit analysis? Or city planning?

I suspect some topics which you might think are just "common sense" come from intensive research.

I could believe that intensive research can be useful, but from the cases I have seen of cost-benefit analysis or city planning actually being used, no I don't like those. Cost-benefit analysis (as actually used by business) seems to leave out any strategic advantage that cannot be quantified, and city planning (as actually practiced by cities) seems to be responsible for a lot of what went wrong in the last half of the 20th century in America's cities.

Of course, it could well be that the tools of social scientists were being mishandled by amateurs; I could pretty easily believe that. But as examples, those two both look to me to be net negatives.

> leave out any strategic advantage that cannot be quantified

Everything can be quantified, no matter how intangible. Perhaps you're not familiar with the research.

> city planning (as actually practiced by cities)

On the whole, city governments ignore city planning researchers.

What's your academic background? If you've never studied social sciences, you may not be aware of what the scientists are saying.

I've thought of one social technology I'd really like: a Pigovian tax to reduce pollution, especially greenhouse gases.
Since the data industry is using repeated real-life experiments on humans, can't we say that the data industry IS social science (or part of it)?
I'd go further to say that the data science community is doing social science better than the SocSci community in this regard.

The problem here is that because data science is industrially driven, there is no separation between measurement and influence. So, whereas academic sciences try not to influence behavior, but rather understand it, data science is trying to influence user behavior with understanding and ethics as an afterthought.

What does data science have to offer to society? Currently not much either. #sarcasm
I think the author forgets that there are quantitative as well as qualitative social sciences. An anthropologist like the author might think that ethnography doesn't have much to offer (though that seems dubious to me too; I'm no expert). But what about econometrics, item response theory work from psych, field experiments and "natural experiments" work from political science, or network analysis work from sociology? Social science methodologists have contributed to a lot of the same things industry data scientists work on.
The field of Economics, especially behavioral economics, is probably trying the hardest to develop means of measurement for social phenomena.

At the end of the day, though, if you want to do data, then you have to have something which you can measure. So far the social sciences have not been able to agree on a consistent observable metric for comparison.

Until we can figure out something measurable from first principles, social phenomena will be measured by proxy. Observational data about how people act is the closest we can come today to trying to determine why people act.