by Gatsky
4223 days ago
How do you feel about reproducible computing in Python? R is very well set up to A) get running on any platform easily and B) report the crucial parts of the environment. If I grab someone else's (published) code written in R, I'm pretty confident I can make it work. Part of this is the great package management through CRAN or Bioconductor, and part is that important reference data for bioinformatics is often available through the package manager itself. I haven't done much with Python, but I don't quite get the same feeling (happy to be told that the reality is otherwise!). For example, the opening line of the installation guide for Pandas doesn't inspire great confidence in me: "The easiest way for the majority of users to install pandas is to install it as part of the Anaconda distribution, a cross platform distribution for data analysis and scientific computing."[1] Do I really need to install the HDF5 package so I can split a concatenated variable into two columns? [1] http://pandas.pydata.org/pandas-docs/stable/install.html
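For the "report the crucial parts of the environment" half of the question, here is a minimal sketch of what a rough Python counterpart to an R session report could look like, using only the standard library (Python 3.8+). The function name `session_info` and the package names passed to it are made up for illustration; this is one possible approach, not an established convention.

```python
import platform
from importlib import metadata


def session_info(packages):
    """Return a dict describing the interpreter, the OS, and selected package versions.

    `packages` is a list of distribution names chosen by the caller (hypothetical
    examples below); a package that is not installed is reported as None.
    """
    info = {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "platform": platform.platform(),
        "packages": {},
    }
    for name in packages:
        try:
            info["packages"][name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            info["packages"][name] = None  # not installed in this environment
    return info


if __name__ == "__main__":
    # Example package names only; substitute whatever the analysis imports.
    for key, value in session_info(["pandas", "numpy"]).items():
        print(f"{key}: {value}")
```

Printing this at the end of an analysis script records at least the interpreter and library versions alongside the results, though it says nothing about system libraries or reference data.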
In a previous life, what we did for every project was download a snapshot of an R environment, including all packages. That snapshot, and only that, was used for all computation on the project from start to finish. If Docker had been around at the time, that's what we'd have used.
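A much lighter-weight cousin of that per-project snapshot idea can be sketched in Python: record the exact version of every installed package in a file kept with the project. This roughly approximates what `pip freeze` produces; the function names and the output path are hypothetical, and unlike a full environment snapshot or a Docker image it captures neither the interpreter nor any system libraries.

```python
from importlib import metadata


def freeze_lines():
    """Return sorted 'name==version' lines for every installed distribution."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata lacks a name
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


def write_snapshot(path):
    """Write the pinned list to `path` and return the number of packages recorded."""
    lines = freeze_lines()
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n")
    return len(lines)


if __name__ == "__main__":
    # Hypothetical file name; keep it under version control with the project.
    count = write_snapshot("environment-snapshot.txt")
    print(f"recorded {count} packages")
```

Run inside a project-specific virtual environment, the resulting file pins only what that project uses, which is closer in spirit to the one-environment-per-project workflow described above.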