Hacker News
by roel_v 3119 days ago
I complain about this every time a post on R programming comes up here, but my favorite thing to hate (out of many) about R is that there's no way to find out what the directory of the current script is. Imagine someone would want to use relative paths to their data files so that they could version control their scripts and run them unmodified on different machines! We wouldn't want to enable such abominations now would we!
5 comments

I think you need to reference the data files from the working directory, not the directory where the script currently is. The two aren't necessarily the same.

The current working directory can be found with getwd() and set with setwd().

If you set the working directory at the beginning of the script, paths to data files should be relative to that location.
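A minimal sketch of that approach, assuming a hypothetical project folder (a temporary directory here) that contains file.csv; the folder and variable names are made up for illustration:

```r
# Hypothetical project folder; a temp dir stands in for a real checkout.
project_dir <- tempfile("project")
dir.create(project_dir)
write.csv(data.frame(x = 1:3), file.path(project_dir, "file.csv"),
          row.names = FALSE)

# setwd() invisibly returns the previous working directory.
old_wd <- setwd(project_dir)

# Relative paths are resolved against getwd(), not the script's location.
my_df <- read.csv("file.csv")

# Restore the previous working directory.
setwd(old_wd)
```

This only helps if the script already knows which directory to pass to setwd(), which is the part the original complaint is about.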

Yes, but for example when running from within RStudio, or calling from other scripts, the two aren't the same. When calling from other scripts you can do chdir() first of course, but my point is that your script can't sensibly rely on the working directory and the script's directory being the same.
I've actually noticed this and was totally blown out of the water by it. I understand you can use getwd() and setwd() but I thought you could simply do relative paths (similar to other languages) but it doesn't always work and I haven't figured it out.

For example, if you are loading a data.frame from a csv, my.df <- as.data.frame(read.csv("file.csv")) seems to work if the R script is in the same directory as the .csv. This is what I tend to do in .Rmd code chunks (which is my primary R workflow). It also tends to work across platforms which is handy as who knows what box I'm going to be hacking away on. However, R's preference for absolute paths in general I find very strange as I'm always on different machines with, of course, different directory structures. Isn't everyone?

Regardless, R is funky but I think I like it in a sort of awkward 'first date not sure yet' kind of vibe. I'm a noob and novice programmer otherwise though so who knows.

Maybe I am misunderstanding your question, but isn't that just getwd()?
No, that gets you the working directory, which isn't always the same (like, when running from RStudio, getwd() returns the RStudio installation path IIRC).
If you run scripts non-interactively, you could try commandArgs? That should contain the file path. For RStudio maybe the rstudioapi package has a function like that...
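A rough sketch of the commandArgs idea, assuming the script is started with Rscript, which records the script as a `--file=` entry in `commandArgs(trailingOnly = FALSE)`. The helper name `script_path_from_args` is made up for illustration:

```r
# Hypothetical helper: pull the script path out of a vector of
# command-line arguments. Returns NA if no --file= entry is present,
# e.g. in an interactive session.
script_path_from_args <- function(args = commandArgs(trailingOnly = FALSE)) {
  file_arg <- grep("^--file=", args, value = TRUE)
  if (length(file_arg) == 0) {
    return(NA_character_)
  }
  sub("^--file=", "", file_arg[1])
}

# Example with an explicit argument vector instead of the real one.
example_path <- script_path_from_args(
  c("R", "--no-echo", "--file=analysis/run.R", "--args", "input.csv")
)
```

Passing the arguments in explicitly keeps the helper easy to check; with the default it reads the arguments of the running R process.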
Well yes, there are several workarounds; to the point that there are packages that wrap up all methods and try to decide which one is the correct one in the given invocation. This is the problem with R - there are many things for which you need only a single line to do something very complicated, but there are also many things that are just a tiny bit different from the standard cases, and are absurdly complex. Everything is just slapped together, without thought for the overall picture or overarching design.

Google stack overflow for 'R get current script path' some time, and weep not only at how often this is asked and upvoted (i.e., how many people suffer from this), but also at the suggestions offered - how divergent they are, and how complicated. But this is just one example. R is death by a thousand cuts.

not gonna argue that : D Been working with R for 3+ years now and totally second the "there are also many things that are just a tiny bit different from the standard cases, and are absurdly complex".

I can gradually move on to python at work now, which so far has been much more pleasant. It always surprises me what you can end up doing in R though, but really shouldn't if you want to go to production : )

getwd() returns the working directory, which can be set with setwd(), even from within RStudio. I'm still not sure what the problem is.
Not sure if it solves your problem but `source(file, chdir = TRUE)` can be useful.
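As a small illustration, assuming a hypothetical scripts folder (a temporary directory here) that holds both load.R and file.csv: with `chdir = TRUE`, the working directory is switched to the sourced file's directory while it is evaluated, so the relative path inside load.R resolves.

```r
# Hypothetical layout: load.R sits next to the data file it reads.
script_dir <- tempfile("scripts")
dir.create(script_dir)
write.csv(data.frame(x = 1:3), file.path(script_dir, "file.csv"),
          row.names = FALSE)
writeLines('my_df <- read.csv("file.csv")', file.path(script_dir, "load.R"))

wd_before <- getwd()

# Evaluate load.R in its own environment, with the working directory
# temporarily set to the directory containing load.R.
env <- new.env()
source(file.path(script_dir, "load.R"), local = env, chdir = TRUE)

loaded_rows <- nrow(env$my_df)
```

This covers scripts called from other scripts, but not a script run directly.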
rstudioapi::getActiveDocumentContext()$path

I believe that is what you are looking for.

Yes, and now I want to also make it work when not invoked from RStudio, and for various R versions. So now I find myself wrapping all these options into a function, which I have to copy for every 10 line script. So then I make a package for it; or use the functions in someone else's package and add a dependency which I'm not sure will still work a year from now.

Or I could just use a sane language and go home in time for dinner.

(I mean I know about all the solutions and non-solutions; I've looked into this at least a dozen times over the last 5+ years. My point is that this shouldn't have been an issue in the first place.)
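For reference, the kind of wrapper being described might look roughly like this. It is a sketch under assumptions, not a vetted solution: `current_script_dir` is a made-up name, the order of the fallbacks is a guess, and the `ofile` lookup relies on an internal variable of `source()` that is not documented API and could change between R versions.

```r
# Hypothetical wrapper: try several ways of locating the running script
# and return its directory, or NA if none of them applies.
current_script_dir <- function() {
  # 1. Started via Rscript: the script shows up as a --file= argument.
  args <- commandArgs(trailingOnly = FALSE)
  file_arg <- grep("^--file=", args, value = TRUE)
  if (length(file_arg) > 0) {
    path <- sub("^--file=", "", file_arg[1])
    return(dirname(normalizePath(path, mustWork = FALSE)))
  }

  # 2. Loaded via source(): look for source()'s internal `ofile` variable
  #    in the call stack (an implementation detail, not a public API).
  for (frame in rev(sys.frames())) {
    if (!is.null(frame$ofile)) {
      return(dirname(normalizePath(frame$ofile, mustWork = FALSE)))
    }
  }

  # 3. Running inside RStudio: ask the IDE for the active document,
  #    which needs the rstudioapi package and a running RStudio session.
  if (requireNamespace("rstudioapi", quietly = TRUE) &&
      rstudioapi::isAvailable()) {
    path <- rstudioapi::getActiveDocumentContext()$path
    if (nzchar(path)) {
      return(dirname(normalizePath(path, mustWork = FALSE)))
    }
  }

  NA_character_
}

script_dir <- current_script_dir()
```

Each branch covers exactly one way of invoking the script, which is why it ends up copied around or packaged.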

You are absolutely right, but then either your first post was misworded or I misunderstood the issue (most likely the latter), as there is a way to know the directory of the script.

100% agree that R is not a sane language.

Ah yes now I see - I said 'there's no way to find the current script' which isn't true. So that's probably what the others in this thread are also objecting to :) I guess what I meant was 'there's no sane way' or 'look at how hard it is to do this tiny thingy which anyone with a programming background would find so basic, they wouldn't even consider it might not exist'. So yeah, I did screw up on making my point there.
So you need to have a specific IDE installed for this to work?
Nope! Base R works great. Old-school vi to edit scripts, and R base installation to run them (or REPL around). Of course, the IDEs do offer a lot of support, and RStudio is great for making your R functions into packages that are easy to share.
That was in reference to the rstudioapi package for finding the path of the current file, which, as I've just checked, needs a running RStudio session to work.