Hacker News
by fastaguy88 843 days ago
(1) The big problem I have is transitioning from RStudio to a pipeline (so I end up not using RStudio). A traditional pipeline is going to be a script with some set of arguments (parameter values, fitting functions, and data file names) that I put into a shell script and call like this:

my_plot_script.R --plot_col=g_max --output_type=pub_quality data_file1 data_file2 data_file3

It's possible to use optparse/OptionParser() to get that information (but then you need an option for every argument; there is no --param1 X --param2 Y file1 file2 file3), and it is much more difficult to fit those arguments into the RStudio environment. I want RStudio to be able to emulate reading command line arguments (since they do not exist in RStudio). Right now, I have to check whether commandArgs() returns any arguments and, if not, do something else to get the information to the script in RStudio.
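
For illustration, the commandArgs() check described above might be sketched in base R roughly like this. The helper names get_args() and parse_cli() are hypothetical (they are not part of optparse or RStudio), and the sketch assumes options are always written as --name=value, with everything else treated as a data file name.

```r
# Hypothetical helper: use the real command line when there is one,
# otherwise fall back to an emulated one (e.g. inside RStudio).
get_args <- function(default_args = character(0),
                     args = commandArgs(trailingOnly = TRUE),
                     is_interactive = interactive()) {
  if (is_interactive || length(args) == 0) default_args else args
}

# Hypothetical helper: split "--name=value" options from positional file names.
parse_cli <- function(args) {
  is_opt <- grepl("^--[^=]+=", args)
  kv <- sub("^--", "", args[is_opt])          # drop the leading dashes
  opts <- as.list(sub("^[^=]+=", "", kv))      # keep the part after the first "="
  names(opts) <- sub("=.*$", "", kv)           # keep the part before the first "="
  list(options = opts, files = args[!is_opt])
}

# In RStudio the emulated arguments are used; under Rscript the real ones win.
emulated <- c("--plot_col=g_max", "--output_type=pub_quality",
              "data_file1", "data_file2", "data_file3")
cfg <- parse_cli(get_args(default_args = emulated))
```

The same script can then be run from a shell script or stepped through in RStudio, at the cost of keeping the emulated argument vector in sync by hand.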

(2) There needs to be an option that says STOP if something doesn't make sense. I have dozens of beautiful data plots that look great but do not in fact plot what I think they do, because factors have not been properly assigned to colors, shapes, or linetypes. (And it can be really hard to recognize that the data has not been plotted properly.) Give me an option that says: if I did not explicitly declare a column a factor, and I did not specifically associate colors/shapes/lines with factors, then the data will not be plotted.

2 comments

(1) You might want to check out https://github.com/t-kalinowski/Rapp by my colleague Tomasz.

(2) I think part of that is in scope for strict (https://github.com/hadley/strict). You might also be well served by adopting some more data validation tooling, e.g. pointblank (https://rstudio.github.io/pointblank/).

On point two, can’t you just use stopifnot(condition)? Then log it etc?
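
A minimal sketch of what the stopifnot() approach could look like for the factor problem, run before any plotting call. The function check_aesthetics() and the example column names are hypothetical (this is not an option in ggplot2 or strict), and the named conditions in stopifnot() assume R 4.0.0 or later.

```r
# Hypothetical guard: refuse to continue unless the column is an explicit
# factor and every level has an explicitly assigned color/shape/linetype.
check_aesthetics <- function(data, column, values) {
  stopifnot(
    "column must exist in data" = column %in% names(data),
    "column must be declared as a factor" = is.factor(data[[column]]),
    "every factor level needs an explicit value" =
      all(levels(data[[column]]) %in% names(values))
  )
  invisible(data)
}

df <- data.frame(strain = c("a", "b"), g_max = c(1.2, 0.8),
                 stringsAsFactors = FALSE)
colors <- c(a = "black", b = "red")

# strain is still a character column here, so the guard stops with an error.
res <- tryCatch(check_aesthetics(df, "strain", colors),
                error = function(e) conditionMessage(e))

# After an explicit factor() call the same check passes silently.
df$strain <- factor(df$strain)
check_aesthetics(df, "strain", colors)
```

This only covers the conditions you think to write down; it does not make the plotting library itself refuse ambiguous input.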