Hacker News
by wokwokwok 1247 days ago
Having migrated 1000s of lines of legacy R, all I can say is… yes, but then you have to use R. (:

R is not a replacement for pandas.

R is its own special little painful ecosystem, loved by people who don’t have to maintain the code they write.

You can complain all you like about pandas, but at the end of the day, it’s python. Python tooling works with it. The python ecosystem works with it.

It’s not without faults, but at least you’ll have a community of people to help when things go wrong.

R, not so much.

(Spoken as a jaded developer who had to support R on databricks, which is deep in the hell of “well, it’s not really a tier one language” even from their support team)


Having written tens of thousands of lines of R code that I've been maintaining and using for production pipelines for 9 years...

Sounds like you worked with (or wrote) really bad R code.

The point I’m making isn’t that you can write bad r code; you can write bad code in any language.

…the point I’m making is that when you already have bad r code, it’s a massive pain in the ass.

Bad python is terrible too, but lots of people know python, and it’s easy to find help to unduck bad python code and turn it into maintainable code.

That has not been my experience with r. Ever. At any organisation.

Your experience may vary. (:

> when you already have bad r code, it’s a massive pain in the ass.

Do you think it’s because R code tends to be written by statisticians and stats-adjacent domain experts who don’t necessarily know how to write clean code while Python code has at least some input from actual programmers? Or is this really down to the language itself?

Can you elaborate please? This hasn’t been my experience whatsoever; curious what the issues have been.

I'm not sure I can say more than I already did, but I'll try to be more specific:

The R community is categorically smaller than the python community. The support on community forums is harder to get, or non-existent (e.g. with databricks).

Are you saying you've worked in places where it's easier to find people who are familiar with R to help work on a project than it is to find people who are familiar with python?

That you've found it's easier to hire people who are familiar with R than it is to hire people who are familiar with python?

I... all I can say is that has not been my experience.

The places I've worked, of all the developers a small handful of people use R, and a small subset of those are good at it.

I don't hate R. I don't think it's a bad language. I'm saying: It's harder to support, because it's obscure, rarely used by most developers, and the people who use it and know it well are rare and expensive.

As a data engineer, expected to support workflows in production: Don't use obscure crap and expect other people to support it. Not R. Not rust. Not pony.

Using R on databricks, specifically, is a) unsupported^, b) obscure, and c) buggy. Don't do it.

(^ sorry, it's a 'tier 2 language' if you speak to a DB representative, which means bugs don't count and new features don't get support)

All I can say is that my experience has been that supporting python has been less painful; it's a simple known quantity, and it's easy to scale up a team to fix projects if you need to.

Thanks for sharing. Seems your issues are more with databricks than R, but certainly R is more obscure.

At least in my experience we’ve never had issues with people learning it on the job, and far fewer software issues from e.g. versioning, dependencies, regression bugs. It just works; there’s rarely even a need for a venv.

I’d never expect it to be applied as a general purpose language like python though. Typical projects are <1k lines of some specific data task; perhaps our use cases are just different.

I think the difference is working in teams or with other people's code. Bad python is usually fairly readable, there is a sense of "pythonic" code that the language pushes you towards. R is the complete opposite, there are 50 different ways to do every simple thing, coupled with R users generally not having much sense of good code practices.

Maybe I'm just jaded because I've inherited a 100K+ line R codebase at my job written by a single person with no unit tests and about 3 lines of comments, and it's a completely miserable experience.