Hacker News
by LeoJiWoo 3117 days ago
That is concerning. I knew over-fitting was a real issue with AI.

Still haven't found a good way to deal with high bias or high variance myself.

1 comment

Cross-validating the classifier and its hyperparameters, together with a good scoring metric such as the Matthews correlation coefficient, goes a long way. Since the classes are very imbalanced, an appropriate scoring metric is very important. Even more importantly, train with lots of high-quality data whenever possible. Anecdotally, many seem to obsess over the particular classification algorithm while neglecting data quality. A classifier is only ever as good as its training set.
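To make the point about scoring metrics concrete, here is a rough sketch in plain Python of why accuracy can mislead on imbalanced classes while the Matthews correlation coefficient (MCC) does not. The labels and the two "classifiers" are made up for illustration (95 negatives, 5 positives), and the helper names are my own, not from any library.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels encoded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)

def matthews_corrcoef(y_true, y_pred):
    """MCC = (tp*tn - fp*fn) / sqrt((tp+fp)(tp+fn)(tn+fp)(tn+fn))."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0 when the denominator vanishes.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

# Hypothetical imbalanced data: 95 negatives followed by 5 positives.
y_true = [0] * 95 + [1] * 5

# Degenerate model that always predicts the majority class.
always_negative = [0] * 100

# Model with 3 false alarms that finds 4 of the 5 positives.
useful = [1] * 3 + [0] * 92 + [1] * 4 + [0] * 1

for name, y_pred in [("always_negative", always_negative), ("useful", useful)]:
    print(name, accuracy(y_true, y_pred), round(matthews_corrcoef(y_true, y_pred), 3))
```

The two models differ by only one point of accuracy (0.95 vs 0.96), but the majority-class model gets an MCC of 0 while the one that actually finds positives scores well above that. In practice you would compute the metric on held-out folds during cross-validation rather than on a single split.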