by posix_compliant
2908 days ago
Good post, but I couldn't disagree more. Regardless of your business size, it will always be valuable to know things such as:

* How does every additional coupon-dollar affect the total amount a customer buys?
* What is the relationship between customer age and retention for my store?
* Does giving a customer more purchase options help or hurt their chances of making a purchase?

My experience is that each of these questions can be answered, in part, with 3 lines of Python code:

    from sklearn.linear_model import LinearRegression
    lr = LinearRegression()
    lr.fit(X, y)

Then look at the beta coefficients of the model (lr.coef_), and you have a rough idea of how each feature relates to the outcome. Doing something like this in SQL sounds difficult. If you have data to interpret, it makes sense to use methods like these. I can't think of an example where you have data but refuse to look at it until your company is "bigger".
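
For anyone who wants to try the three-line fit end to end, here is a minimal sketch. The data is synthetic and entirely made up for illustration: the column names (coupon_dollars, customer_age), the sample size, and the "true" effects of 3.0 and 0.5 are assumptions, not numbers from any real store.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data, invented for illustration only: 200 hypothetical customers.
rng = np.random.default_rng(0)
coupon_dollars = rng.uniform(0, 20, size=200)
customer_age = rng.uniform(18, 70, size=200)

# Assumed ground truth: +3.0 spent per coupon dollar, +0.5 per year of age.
total_spent = (
    10.0
    + 3.0 * coupon_dollars
    + 0.5 * customer_age
    + rng.normal(0, 1, size=200)  # noise
)

# Feature matrix: one row per customer, one column per feature.
X = np.column_stack([coupon_dollars, customer_age])
y = total_spent

lr = LinearRegression()
lr.fit(X, y)

# lr.coef_ holds one beta per column of X, in column order.
coefficients = dict(zip(["coupon_dollars", "customer_age"], lr.coef_))
for name, beta in coefficients.items():
    print(f"{name}: {beta:+.2f}")
print(f"intercept: {lr.intercept_:+.2f}")
```

Because the features here were generated independently of each other, the fitted betas should land close to the assumed effects. With real data the features are rarely independent, so the coefficients are a starting point for questions, not a causal answer.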
As a workaround, you could look for a high VIF (variance inflation factor) to detect multicollinearity, use some form of stepwise selection or penalized regression, or use something like relaimpo (https://cran.r-project.org/web/packages/relaimpo/index.html) to judge overall feature importance in the model. I'm not sure of a Python equivalent for relaimpo.
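
The VIF check can be sketched in a few lines of NumPy. This is one way to compute it, not the only one: it uses the identity that the VIF of feature j equals the j-th diagonal entry of the inverse of the feature correlation matrix. The helper name variance_inflation_factors and the synthetic columns a, b, c are hypothetical, chosen for illustration.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return one VIF per column of X (rows are observations).

    Uses the identity VIF_j = [inverse of correlation matrix]_{jj},
    which matches 1 / (1 - R_j^2) from regressing column j on the others.
    """
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))

# Synthetic features, invented for illustration only.
rng = np.random.default_rng(0)
n = 500
a = rng.normal(size=n)
b = a + rng.normal(scale=0.1, size=n)  # nearly a copy of a: collinear
c = rng.normal(size=n)                 # generated independently of a and b

X = np.column_stack([a, b, c])
vifs = variance_inflation_factors(X)

for name, vif in zip(["a", "b", "c"], vifs):
    print(f"{name}: VIF = {vif:.1f}")
```

A VIF near 1 means a feature is roughly uncorrelated with the others. Common rules of thumb flag values above 5 or 10, but those cutoffs are conventions, not hard limits. In this sketch a and b should come out far above that range and c should stay close to 1.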