by sghi 2544 days ago
As someone who uses and teaches R extensively (and loves the tidyverse), I find the tidyverse so much easier to teach to people who have no programming experience and very little faith in their tech or maths skills (which was me as well).

The tidyverse just 'made sense' to me when I started using R for the first time a few years ago, and now I love using R and programming. On the other hand, some of my ex-classmates learnt base R (because that's what we were taught) and found it hard, didn't learn anything properly, and now still think R or other programming languages are opaque and hard.

I'm not particularly fussed if Statistics Profs prefer data.table to dplyr or base R to tidyr. I know what is easier to teach, understand and use for me and for a lot of other ecology/bio students and people.

2 comments

I concur with your sentiments, having built data science teams from the ground up out of people with diverse educational backgrounds.

Programming in base R is more akin to assembly language, and it has accreted a Babel of inconsistencies that make it difficult to teach and learn. Learning base R isolates you on a Galapagos island of academics who are either ignorant of the needs of data workers or too elitist to engage with those not in their priesthood.

Learning the Tidyverse makes for a considerably better transition to other languages, frameworks, and libraries.

Functional programming is closer to algebra than indexing into data structures with magic numbers. I've found more success teaching functional pipelines of data structures using the idioms in Tidyverse as a general framework for data work than base R. Abstraction has a cost but for learning it is the appropriate cost.
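To make that contrast concrete, here is a minimal sketch using R's built-in `mtcars` data set (my choice of example, nothing from the thread depends on it). The first version indexes by column position, the second is a dplyr pipeline. Both are meant to compute the mean `mpg` of the 4-cylinder cars.

```r
library(dplyr)

# Base R with positional "magic numbers":
# in mtcars, column 1 is mpg and column 2 is cyl.
base_result <- mean(mtcars[mtcars[, 2] == 4, 1])

# dplyr pipeline: each step names the columns and says what it does.
tidy_result <- mtcars %>%
  filter(cyl == 4) %>%
  summarise(mean_mpg = mean(mpg)) %>%
  pull(mean_mpg)

# The two approaches should agree.
stopifnot(isTRUE(all.equal(base_result, tidy_result)))
```

To be fair to base R, you can index by name as well (`mtcars$mpg[mtcars$cyl == 4]`), so the positional version is the worst case, not the only option.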

I sense that much of this "monopolistic" fearmongering is really about feeling out of date.

"I know what is easier to teach, understand and use"

I think this really depends on the end point. If you want to learn to read data into R and do basic manipulations, plotting, modeling, etc., the Tidyverse absolutely has a lower barrier to entry. Once you get into writing functions, it gets a little trickier. Knowing some of the base R programming concepts and skills will make you much more efficient. If you start debugging and profiling code, knowing only the Tidyverse becomes a liability because you do not fundamentally understand R's computational model (the Tidyverse does not follow it). Hence, if your end goal is to write and debug functions in R, the steeper learning curve of base R can more than pay off. If not, then the Tidyverse's low barrier to entry can be more attractive.
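A rough illustration of the "writing functions gets trickier" point, assuming dplyr 1.0 or later. `mean_by` and `mean_by_base` are hypothetical helpers made up for this sketch. Inside a function, dplyr verbs need the column arguments wrapped in `{{ }}` so the unquoted names get forwarded, whereas the base R version just takes column names as ordinary strings.

```r
library(dplyr)

# Hypothetical helper: mean of `value` within each level of `group`.
# {{ }} (the "embrace" operator) forwards the caller's unquoted column names.
mean_by <- function(df, group, value) {
  df %>%
    group_by({{ group }}) %>%
    summarise(mean_value = mean({{ value }}), .groups = "drop")
}

# Base R counterpart: column names are plain character strings,
# so ordinary function-argument rules apply.
mean_by_base <- function(df, group, value) {
  tapply(df[[value]], df[[group]], mean)
}

tidy_out <- mean_by(mtcars, cyl, mpg)          # a tibble, one row per cyl
base_out <- mean_by_base(mtcars, "cyl", "mpg") # a named 1-d array
```

Neither version is hard once you have seen it, but the first needs a concept (tidy evaluation) that you do not meet until you start wrapping dplyr code in your own functions.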