| HN Mirror



	by adamsmith143 1302 days ago
	On the flip side you used to have statisticians writing code that is frankly unusable in a Production environment. You would weep at the R code I've seen and had to turn into something to actually produce business value.

4 comments

fnands 1302 days ago

There is a bit of a joke that a data scientist is someone who can do better stats than the average SWE and can write better code than the average statistician. Both of those are relatively low bars to clear, though.


ketzo 1302 days ago

The way I heard the joke was "a data scientist is someone who's not good enough at math to be a statistician, and not good enough at programming to be a software engineer."

Maybe a little harsh...


disgruntledphd2 1302 days ago

That's much better. Consider that stolen.


fnands 1302 days ago

Harsh, but funnier than how I phrased it.


drgiggles 1302 days ago

This is exactly my point. Let subject matter experts in their respective disciplines handle what they know and communicate through the lingua franca of R. Most data scientists/statisticians probably shouldn't be writing production code, and I think that's ok. It's a failing of management to think that coding is coding and not understand the value of true engineering ability.


numbsafari 1302 days ago

My first job basically consisted of taking code in FORTRAN and translating it into C++ with robust testing and engineering, and then frontending that code into a ton of spreadsheet packages. So you had quants doing quant work, software engineers doing software engineering, and analysts and traders being analysts and traders, instead of having quants fail at all three, which is more or less what data science is.


esparrohack 1302 days ago

Yeah but in the end it’s just code. And even better, just R.

The business value comes from the stats guy.


adamsmith143 1302 days ago

When the R/stats guy quits and you have to figure out which of his 7 notebooks to run in which order, which local files need to be in which local directories to run correctly, which versions of each package are now broken, and which code you need to rewrite to fix it, you start to realize the value he produced was clicking a lot of buttons in the right order, and that overall this doesn't scale at all.


esparrohack 1302 days ago

Yeah, but I meant that because the business value is in the stats, and there is such low quality of stats in the field to begin with, it’s borked no matter what.

There’s no point in fixing it. You can just pretend like you did. But if the stat work is quality, then it’s worth the effort to optimize.


mellavora 1302 days ago

That sounds more like a Jupyter notebook/Python problem than an R problem.

But otherwise, yes, I see the problem.


adamsmith143 1301 days ago

The hours I have spent debugging package problems in R would disagree.


esparrohack 1301 days ago

I know that pain. That’s why I’m saying avoid it if you can do so.
