by abhgh 2334 days ago
The intermediate R layer is often a well-thought-out abstraction, so it makes a lot of sense to reuse it. At different points in my career I have written skunkworks wrappers for rpart (decision tree learner) and glmnet (generalized linear models with elastic net). While it's true that some subset of the features of these libraries exists in other languages (my primary working language is Python), those implementations are not as feature-rich. To take the first example, rpart offers "surrogate splits"[1], which scikit-learn's decision trees lack. Also, scikit-learn's trees don't support categorical features directly (you need to encode them into one-hot vectors), whereas rpart does.
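To make the one-hot point concrete, here is a minimal sketch in plain Python of what that encoding step amounts to. The helper name `one_hot_encode` and the toy rows are made up for illustration; this is not scikit-learn's or rpart's API, just the idea of turning one categorical column into several 0/1 columns.

```python
def one_hot_encode(rows, column):
    """Replace one categorical column with a 0/1 indicator column per level.

    `rows` is a list of dicts (hypothetical toy format, not a library type).
    """
    # Collect the distinct levels; sorting keeps the column order deterministic.
    levels = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every field except the categorical one.
        new_row = {key: value for key, value in row.items() if key != column}
        # Add one indicator per level: 1 if this row has that level, else 0.
        for level in levels:
            new_row[f"{column}={level}"] = 1 if row[column] == level else 0
        encoded.append(new_row)
    return encoded


# Toy data, invented for this example.
rows = [
    {"age": 31, "color": "red"},
    {"age": 45, "color": "blue"},
    {"age": 27, "color": "red"},
]
encoded = one_hot_encode(rows, "color")
```

A tree learner fed the encoded rows can only split on one indicator at a time (say, `color=red` vs. everything else), whereas a learner with native categorical support can consider groupings of levels in a single split.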

In short, the intermediate layers often give you well thought out features.

In some cases, you might not even have a corresponding library in your language. For example, if you want interaction terms in your linear model that respect hierarchies, there aren't many options around, but R has glinternet [2].
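As a rough illustration of what "respects hierarchies" means, here is a small sketch in plain Python. It assumes the strong-hierarchy convention, where an interaction between two variables is allowed only if both of their main effects are also in the model. The function names and variable names are hypothetical, and this only checks the constraint; it is not how glinternet fits a model.

```python
def hierarchy_violations(main_effects, interactions):
    """Return the interactions that break strong hierarchy.

    Assumption: strong hierarchy means an interaction (a, b) may appear
    only if both a and b are present as main effects.
    """
    main = set(main_effects)
    return [(a, b) for a, b in interactions if a not in main or b not in main]


def respects_strong_hierarchy(main_effects, interactions):
    """True when every interaction has both of its main effects in the model."""
    return not hierarchy_violations(main_effects, interactions)


# Hypothetical models, invented for this example.
ok_model = respects_strong_hierarchy(["x1", "x2", "x3"], [("x1", "x2")])
bad_model = respects_strong_hierarchy(["x1"], [("x1", "x2")])
violations = hierarchy_violations(["x1"], [("x1", "x2")])
```

Here the first model is fine because `x1` and `x2` both appear as main effects, while the second is not because `x2` enters only through the interaction.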

[1] https://stats.stackexchange.com/questions/50310/how-does-rpa...

[2] https://cran.r-project.org/web/packages/glinternet/index.htm...