Hacker News
by nerdponx 3556 days ago
I do it at my company. I prototype in R, and then end up having to rewrite chunks of it in Python so it can be worked into our application, which right now is exclusively Python.

It's not a matter of performance. It would just be an enormous amount of engineering overhead to start calling R from inside the Python app.

2 comments

Check out opencpu.org. It's an R web API. Really cool stuff.
It seems like you could simply use http://jupyter.org/ and just run the script with R code inline.

http://blog.revolutionanalytics.com/2016/01/pipelining-r-pyt...

Also, why not just switch to Pandas? It really is a pretty close R clone.

It has nothing to do with interoperability on my machine. I use notebooks (and Pandas) all the time, and I consider myself fluent in both R and Python.

It's because R is a substantial engineering dependency. As I said, our entire stack is Python and Node. Yes, you can call R from Python using rpy2, but that's a pro-bono project maintained largely by one person. It's great for casual use, but there is far too much risk to start talking about building critical business code around it.

So why not Pandas?
Personal preference. I switch back-and-forth based on the project.

R data frames are native and feel native. Pandas data frames are non-native and can be a pain in the ass to work with.

That, and there is a lot more to the decision than just which data frame implementation I like better.

"Pretty close" as long as you stay within the region of common functionality. I wouldn't say it's a clone.
That is true. I actually started my journey with Pandas and then switched to R for the ecosystem, and because zero-based indexing for data science drove me nuts.
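
For anyone who hasn't hit the indexing mismatch, here is a minimal sketch of it in plain Python. The data and the `r_index` / `r_slice` helpers are hypothetical, made up for illustration and not taken from any library. They just mimic R's convention of 1-based positions and ranges that include both ends, next to Python's 0-based, end-exclusive slices.

```python
# Hypothetical data; nothing here comes from the thread itself.
values = [10, 20, 30, 40, 50]


def r_index(seq, i):
    """Return the element R would address as seq[i] (1-based position)."""
    if i < 1 or i > len(seq):
        raise IndexError("1-based index out of range")
    return seq[i - 1]


def r_slice(seq, start, stop):
    """Return what R would address as seq[start:stop].

    R ranges are 1-based and include both endpoints.
    """
    if start < 1 or stop > len(seq) or start > stop:
        raise IndexError("1-based range out of bounds")
    # Shift the start down by one; the inclusive stop already matches
    # Python's exclusive upper bound.
    return seq[start - 1:stop]


first_py = values[0]            # Python: first element is at position 0
first_r = r_index(values, 1)    # R convention: first element is at position 1

py_slice = values[1:3]          # Python: positions 1 and 2, end excluded
r_equiv = r_slice(values, 2, 3) # R convention: positions 2 and 3, end included

print(first_py, first_r)
print(py_slice, r_equiv)
```

Both pairs pick out the same elements. The point is only that every position and range has to be mentally shifted when moving code between the two languages.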

But I do feel that the goal is a clone.

"Python has long been great for data munging and preparation, but less so for data analysis and modeling. pandas helps fill this gap, enabling you to carry out your entire data analysis workflow in Python without having to switch to a more domain specific language like R." http://pandas.pydata.org/

How much experience do you have in statistical computing, out of curiosity?