by Dn_Ab 5165 days ago
_delirium's post acknowledges your point but is looking for an even rarer person:

"There could be more of them in the future, but someone who is top-notch at all of statistics, programming, and data-presentation has long been less common than someone who's good at one or two of those".

That is, someone who can program, understands statistics, and can present the data in an appealing manner without losing significant fidelity. Many people underestimate the difficulty and skill required to present data in a way that makes sense and also actually says something.

There is a significant gap between presenting data that is satisfactory to a research advisor and something that a business person with barely enough time to think can grasp without misconception.


Again, I completely see the difference (and am actually in the process of moving full time to the private sector from academia, so will probably understand a lot more in six months), but visualising data well is not that hard.

Step 1: Learn R
Step 2: Learn PCA
Step 3: Learn ggplot2
Step 4: Play with the different geoms until you understand them (seriously though, everyone's eyes are optimised to find patterns, and if you can apply significance testing to these then you should be good)
Step 5: Profit!?

Note that I am being somewhat facetious here, but I suspect that the mathematical knowledge and the ability to apply it to business problems will be the real limiting factors, as good practices in data analysis, programming and visualisation can be learned. Granted, that will take a long time, and there will be individual differences, but it's doable.
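On the significance-testing part of Step 4: one way to check whether a pattern your eyes picked out of a plot is more than noise is a permutation test. The sketch below is in Python rather than R, uses only the standard library, and runs on made-up numbers. The function name `permutation_test`, the two "region" samples, and the parameter defaults are all hypothetical, chosen for illustration; this is a minimal sketch of the general technique, not anything from the thread.

```python
import random
import statistics


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns the estimated p-value: the share of random relabelings whose
    absolute mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(statistics.fmean(group_a) - statistics.fmean(group_b))

    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    extreme = 0
    for _ in range(n_permutations):
        # Shuffle the pooled values, then split them into two fake groups
        # of the original sizes.
        rng.shuffle(pooled)
        diff = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1

    # The add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical data: two samples that look different on a plot.
data_rng = random.Random(42)
region_a = [data_rng.gauss(100, 10) for _ in range(40)]
region_b = [data_rng.gauss(110, 10) for _ in range(40)]

p_value = permutation_test(region_a, region_b)
print(f"estimated p-value: {p_value:.4f}")
```

A small p-value suggests the gap between the two groups would rarely show up under random relabeling. It says nothing about whether the gap matters to the business, which is the harder part.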

Whether or not it will be done at all though is another matter.

Again, _delirium's point is trivially true if one requires these people to know all of statistics, programming and data presentation, as I don't think there's anyone who knows all of any one of these subjects.

I suppose it somewhat depends on what the skill levels for each of these areas need to be, and that varies from person to person as well as from application to application.

Allow a short vignette from a former academic and now management consultant.

We spent six months at a major pharmaceuticals client examining their reimbursement data. Poring over many millions of rows of transaction data and thousands of payment codes (which, of course, were unique across sales geographies), we determined the ten regions at highest risk of reimbursement collapse. R was used, maps were created, beers all around.

But almost none of it was used for the executive presentation. In fact, the only part that was included was that we had ten regions that needed fixing, and our suggestions on how to fix them. You see, the CEO was dyslexic, the chairman of the board was colorblind, and the COO was a white-boarding kind of gal, so given this audience the nuts and bolts of our advanced statistical analysis were simply irrelevant.

This is hardly surprising. If we are having so much trouble hiring people who are fluent in Big Data, how can we expect business leaders to be even conversant? With only slight exaggeration, the way you do your analysis and the visualizations that you create are not important.

Companies are demanding Big Data scientists because they suddenly have lots of data and see the term Data Scientist in the news. But what they really want is not Data Scientists; it's business insights and implications from Big Data. The customer needs 1/4" holes, but we're all arguing over which brand of laser-powered diamond drill they should buy.

Nailed it.
I still remember my first business presentation. I had a slide talking about how I did a statistics study. I was told to take the word "study" out because it had bad connotations for the target audience (middle managers at Bristol-Myers Squibb if you're curious).

The comment was probably right. But I was horrified.

I agree, and it's all the more true if you consider that "presenting" data may actually be more like creating an interactive environment to explore data.

I believe that data analysis yields the best results when perusing the data and tuning the models are closely connected tasks.