


	by professionalguy 1815 days ago
	That’s cool and everything, but I don’t know many DS people who still use R. Maybe academics still do?

9 comments

iafiaf 1815 days ago

R is overwhelmingly used in bioinformatics. There is nothing quite like Bioconductor. Most new tools/methods (for example, in scRNA-seq) release R packages first.


asdff 1815 days ago

Well, I'd say conda is quite like Bioconductor in how easy it makes installing relevant packages. scRNA-seq has popular R packages like Seurat but also popular Python packages like scanpy.


fatboy93 1813 days ago

I didn't understand your comment, which is probably my fault.

But you can absolutely install many Bioconductor packages from conda.

I love using conda as my environment manager rather than compiling and installing 1000 different libraries and tools.

Also, I install mamba for drastically faster dependency resolution.


cardosof 1815 days ago

Really depends on the application. For clean, concise and reproducible ad hoc statistical analytics and modelling, there isn't a better tool than tidyverse+tidymodels.

It's a classic case of the best tool for the job. I usually create simple stuff in R and then move to bigger datasets and production in py+spark.


mellavora 1815 days ago

tidyverse may be clean, but it is nowhere near as concise as data.table.

data.table is also typically orders of magnitude faster.


cardosof 1814 days ago

Thanks for the point of view, I can't argue since I don't really know data.table. Will check out!


stanbiryukov 1814 days ago

Check out dtplyr: a lazy data.table backend with tidyverse syntax.


listenallyall 1815 days ago

Although I agree and don't like R very much, I believe ggplot is still the gold standard for creating top-quality visualizations. None of the Python (or other language) clones are quite as good. For projects where the end goal is a complex or detailed graph or plot, it's sometimes worth trudging through R to achieve the best final result.


tonyarkles 1814 days ago

Yup, not a data scientist but often do data processing to analyze the outcome of experiments (drone-flight related). I'll use Python/Jupyter if there's a significant amount of clean-up that needs to happen, but R/ggplot is unbeatable if I'm trying to look at the data from different perspectives. As an example, I was trying to look at GPS data the other day and ggplot() + geom_point() + geom_density_2d() was an absolutely perfect way to better grok what was going on.


otabdeveloper4 1815 days ago

I've used Python since 1995, so I should be biased, but switching from Python to R is a huge productivity boost - like switching from Excel to Python. R is just years ahead.


rcthompson 1815 days ago

R sees significant use both in academic/research settings and industry.


Fomite 1815 days ago

It's the dominant language among academic statisticians.


wespiser_2018 1815 days ago

There are a lot of DS folks using it for Bayesian statistics.


legobmw99 1815 days ago

It’s sadly still quite popular in the research world.


jazzyjackson 1815 days ago

What about it makes you sad?


vore 1815 days ago

Not the original poster, but the language has some really weird edges. For instance, check out this wild behavior: http://www.hep.by/gnu/r-patched/r-lang/R-lang_41.html


asdff 1815 days ago

This is because R borrows a lot of syntax from S. When R came out, statisticians were using S, so it was natural to make it like this. Had they gone another way, you'd have had statisticians on mailing lists 20 years ago bemoaning how unlike the familiar S it was. Instead, 20 years later, you get regular old programmers bemoaning that R isn't like the familiar Python, which is what happens on HN whenever there is an R thread.


vore 1815 days ago

I think the behavior is so wildly inconsistent that it's not really justifiable, regardless of being a statistician or not: https://github.com/tidyverse/design/issues/13#issuecomment-4...


asdff 1815 days ago

I mean, compared to other languages these sorts of quirks might seem like big deals, but they rarely come up. You see that error, you copy and paste it, find a Stack Overflow thread explaining it, and then you know what to do next time and move on. R is certainly no C.


kgwgk 1815 days ago

As far as weird edges go, that one is really, really mild. It may even be considered a good idea!

For people interested in weirder things, check The R Inferno (I think it's somewhat outdated by now, though):

https://www.burns-stat.com/documents/books/the-r-inferno/


jsmith99 1815 days ago

That book isn't so much about R weirdness. It's more about teaching data scientists to consider the implications of practices like copying a huge table in memory on every loop iteration.
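A minimal sketch of that kind of pitfall, written in Python rather than R purely for illustration. The function names are hypothetical and nothing here is taken from the book; it only shows how rebuilding a collection on every iteration copies far more data than appending in place.

```python
def grow_by_copy(n):
    """Rebuild the list on every iteration (hypothetical anti-pattern)."""
    rows = []
    copied = 0  # counts how many existing elements get copied overall
    for i in range(n):
        copied += len(rows)   # every existing element is copied again
        rows = rows + [i]     # `+` allocates a brand-new list each time
    return rows, copied


def grow_in_place(n):
    """Append to the same list, so existing elements are not re-copied by us."""
    rows = []
    for i in range(n):
        rows.append(i)
    return rows


slow_rows, copied = grow_by_copy(1000)
fast_rows = grow_in_place(1000)
# Same result, but the copying version touched n * (n - 1) / 2 old elements.
print(slow_rows == fast_rows, copied)
```

The same shape of problem shows up when a large table is extended row by row inside a loop: the result is identical, but the amount of copying grows quadratically with the number of iterations.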


hugh-avherald 1815 days ago

Idiosyncrasies are not something unique to R.

One could express the same surprise at an empty list being considered false in some contexts.
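For instance, assuming the language meant here is Python (the comment does not name it), a short sketch with a hypothetical helper shows how an empty list quietly takes the "false" branch:

```python
def describe(items):
    # `not items` is true for an empty list, but also for None, 0 and "",
    # so this one branch lumps "empty" and "missing" together.
    if not items:
        return "empty or missing"
    return f"{len(items)} item(s)"


print(describe([]))    # empty list is falsy
print(describe(None))  # None is falsy too
print(describe([0]))   # a non-empty list is truthy, whatever it contains
```

Whether that counts as a convenience or a trap depends mostly on what you are used to.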


f6v 1815 days ago

R absolutely dominates some of the life sciences. For example, most of the state-of-the-art bioinformatics tools are in R.
