by yihui 3514 days ago
It is not that we encourage a "hackish development style", but computer scientists and statisticians/data analysts are solving different problems, and a statistician's primary job is often not software development. There is no single correct style for both groups. You should not expect statisticians to be professional software engineers, or vice versa. We can learn good practices from each other. Statisticians and data analysts often use the EDA approach (Exploratory Data Analysis), where it makes sense to "pollute" the workspace temporarily. Running everything from scratch feels like using punch cards, which relates to the history of S (which in turn inspired R). Statisticians at Bell Labs found it tedious to submit a program to a machine, wait a day, get hundreds of pages of output, read through it by eye, modify the program, and do it all again. They wanted instant feedback (plots/summary tables) as they explored the data.

We take reproducibility very seriously. The fact that RStudio's Knit button uses a new R session, instead of the current R session, to compile R Markdown documents was a deliberate choice to make sure your output is produced from a clean R session. But if you are doing EDA, it may not be very pleasant to click this button over and over again every time you update your code (you can if you want).

If your course is focused on software engineering, everything you said makes perfect sense. Statisticians can learn good principles from CS, but they are statisticians after all. There must be tradeoffs.