by kristjankalm 3487 days ago
This is a false dichotomy. Both OLS regression and, say, random decision forest regression have the same objective (predict values) and achieve it by similar means (build a predictive model / function). They solve the same problem. In contrast, assembler and Python are aimed at completely different use cases.

Broadly, whether you should move from OLS to random forest regression comes down to a ratio: SNR increase / increase in man-hours and money spent.

2 comments

It is actually much easier to apply a random forest (or really gradient-boosted decision trees, which almost strictly dominate random forests) than a linear regression. Decision tree methods require far less data preprocessing than linear regression, because the model is able to infer feature relationships on its own. Obviously, if your features are linearly related to your target, then linear regression is much more viable.
This is absolutely true; the one caveat is that with linear regression you can explain the significance of features and their relationship to the response variable in simpler terms.
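To make the trade-off in the two comments above concrete, here is a minimal sketch using only the Python standard library. It is not anyone's production method: the helper names (`fit_ols`, `fit_tree`, `predict_tree`, `r2`), the toy data (y = x^2 and y = 3x + 2 on a grid) and the depth limit of 3 are all assumptions made for illustration, and a single small regression tree stands in for a full random forest or gradient-boosted ensemble.

```python
from statistics import mean

def fit_ols(xs, ys):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def sse(ys):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not ys:
        return 0.0
    m = mean(ys)
    return sum((y - m) ** 2 for y in ys)

def fit_tree(xs, ys, depth):
    """Tiny one-feature regression tree.

    Returns a float (leaf prediction) or a tuple (threshold, left, right).
    """
    if depth == 0 or len(set(xs)) < 2:
        return mean(ys)
    best_cost, best_t = None, None
    # Try every distinct value except the smallest as a threshold,
    # so both sides of the split are always non-empty.
    for t in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        cost = sse(left) + sse(right)
        if best_cost is None or cost < best_cost:
            best_cost, best_t = cost, t
    lo = [(x, y) for x, y in zip(xs, ys) if x < best_t]
    hi = [(x, y) for x, y in zip(xs, ys) if x >= best_t]
    return (
        best_t,
        fit_tree([x for x, _ in lo], [y for _, y in lo], depth - 1),
        fit_tree([x for x, _ in hi], [y for _, y in hi], depth - 1),
    )

def predict_tree(node, x):
    """Walk the nested tuples down to a leaf value."""
    while isinstance(node, tuple):
        t, left, right = node
        node = left if x < t else right
    return node

def r2(ys, preds):
    """Coefficient of determination on the given data."""
    my = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

xs = [i / 50 - 1 for i in range(101)]  # grid on [-1, 1]

# Case 1: nonlinear target, raw feature, no preprocessing.
ys_quad = [x * x for x in xs]
b0, b1 = fit_ols(xs, ys_quad)
r2_ols_quad = r2(ys_quad, [b0 + b1 * x for x in xs])
tree = fit_tree(xs, ys_quad, depth=3)
r2_tree_quad = r2(ys_quad, [predict_tree(tree, x) for x in xs])

# Case 2: linear target. OLS gives two numbers you can read directly.
ys_lin = [3 * x + 2 for x in xs]
intercept_lin, slope_lin = fit_ols(xs, ys_lin)

print(f"quadratic target: OLS R^2 = {r2_ols_quad:.3f}, tree R^2 = {r2_tree_quad:.3f}")
print(f"linear target: y = {intercept_lin:.2f} + {slope_lin:.2f} * x")
```

On the quadratic target the straight line explains essentially nothing unless you hand it an x^2 feature, while the tree picks up the shape from the raw input. On the linear target the OLS fit reduces to an intercept and a slope, which is the "simpler terms" explanation the reply is pointing at; the tree's equivalent is a pile of thresholds.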
Technically, it is an incorrect analogy, not a false dichotomy. A false dichotomy is an incorrect assertion that you have to choose either X or Y in a situation.

The GP compares Python-vs-assembler and random-forests-vs-linear-regression, but the analogy breaks because Python produces assembler and increases the programmer's general certainty about what they are doing. Random forests, as an application, don't make their user more certain of the results. Basically, Python is a relatively "unleaky" abstraction whereas complex AI algorithms are very "leaky" abstractions.