by gcmac
3099 days ago
I disagree with the article's analysis. In a typical machine learning workflow, the distribution of the response variable stays fixed while you cycle through candidate models. So whatever the class distribution is, a higher AUC indicates a better model. It may be true that, at the same AUC, a classifier performs worse on an imbalanced data set than on a balanced one, but that just reflects the fact that good classifiers are harder to build for imbalanced data.

See http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.98....
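
To make the point concrete, here is a rough simulation sketch in Python. Everything in it is an assumption for illustration and comes from neither the article nor the linked paper: the scores are synthetic Gaussians, the 0.5 threshold is arbitrary, and `auc`, `precision_at`, and `simulate` are hypothetical helper names. The same "model" (same score distributions for each class) is evaluated at two different class ratios.

```python
import bisect
import random


def auc(pos, neg):
    """Estimate P(random positive scores above random negative); ties count half."""
    neg_sorted = sorted(neg)
    total = 0.0
    for s in pos:
        below = bisect.bisect_left(neg_sorted, s)        # negatives strictly below s
        at_or_below = bisect.bisect_right(neg_sorted, s)  # negatives <= s
        total += below + 0.5 * (at_or_below - below)
    return total / (len(pos) * len(neg))


def precision_at(pos, neg, threshold):
    """Precision when every score >= threshold is predicted positive."""
    tp = sum(s >= threshold for s in pos)
    fp = sum(s >= threshold for s in neg)
    return tp / (tp + fp) if tp + fp else 0.0


def simulate(n_pos, n_neg, seed=0):
    # Assumed toy model: positives score ~ N(1, 1), negatives ~ N(0, 1).
    rng = random.Random(seed)
    pos = [rng.gauss(1.0, 1.0) for _ in range(n_pos)]
    neg = [rng.gauss(0.0, 1.0) for _ in range(n_neg)]
    return auc(pos, neg), precision_at(pos, neg, threshold=0.5)


# Same score distributions, different class ratios (1:1 vs 1:100).
balanced_auc, balanced_prec = simulate(5000, 5000)
imbalanced_auc, imbalanced_prec = simulate(500, 50000)

print(f"balanced:   AUC={balanced_auc:.3f}  precision={balanced_prec:.3f}")
print(f"imbalanced: AUC={imbalanced_auc:.3f}  precision={imbalanced_prec:.3f}")
```

Under these assumptions the two AUC values should come out roughly equal, while precision at the fixed threshold drops sharply in the 1:100 case. That matches the claim above: AUC ranks models consistently for a given class distribution, and the worse thresholded performance reflects the harder problem, not a defect in the metric.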