by tremon
1907 days ago
No, you don't "know" that your dataset is biased until you perform the statistical analysis explicitly. It might be that your neural net has a non-uniform weight distribution in some dimension (e.g. in time, or in the ordering of the training data), so dismissing any unwanted results by claiming "your dataset is biased" is a form of appeal to (artificial) authority.
edit: And a statistical analysis isn't some sort of magic data genie. Statistics can give rigorous results because it makes strong assumptions. If those assumptions don't hold, the results aren't rigorous anymore. A trillion-parameter model can pull interactions out of your data that almost no statistical analysis of the data would identify ahead of time. So what you need to analyze is the model: try to infer why it's predicting certain results, and then work backwards from there.
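To make that concrete, here is a toy sketch (my own illustration, not anything from a real pipeline). The label is a pure XOR interaction of two features, so a per-feature correlation check on the data sees nothing, while probing a fitted model with a hand-rolled permutation importance does. The lookup-table "model" and the names `fit_table`, `predict`, `accuracy` and `permutation_importance` are all hypothetical stand-ins for a real network and a real interpretability tool.

```python
import numpy as np

rng = np.random.default_rng(0)

# Balanced toy data: every (x1, x2) combination appears equally often.
X = np.tile(np.array([[0, 0], [0, 1], [1, 0], [1, 1]]), (250, 1))
y = X[:, 0] ^ X[:, 1]  # label depends only on the interaction (XOR)

# Data-side check: correlation of each feature with the label, one at a time.
marginal_corr = [float(np.corrcoef(X[:, j], y)[0, 1]) for j in range(2)]

def fit_table(X, y):
    """Stand-in 'model': mean label for each (x1, x2) cell."""
    table = np.zeros((2, 2))
    for a in (0, 1):
        for b in (0, 1):
            mask = (X[:, 0] == a) & (X[:, 1] == b)
            table[a, b] = y[mask].mean()
    return table

def predict(table, X):
    # Threshold the cell mean to get a hard 0/1 prediction.
    return (table[X[:, 0], X[:, 1]] >= 0.5).astype(int)

def accuracy(table, X, y):
    return float((predict(table, X) == y).mean())

def permutation_importance(table, X, y, col, rng):
    """Accuracy lost when one column is shuffled, breaking its link to y."""
    X_perm = X.copy()
    X_perm[:, col] = rng.permutation(X_perm[:, col])
    return accuracy(table, X, y) - accuracy(table, X_perm, y)

table = fit_table(X, y)
baseline = accuracy(table, X, y)
importances = [permutation_importance(table, X, y, j, rng) for j in range(2)]

print("marginal correlations:", marginal_corr)
print("baseline accuracy:", baseline)
print("permutation importances:", importances)
```

The per-feature correlations come out at zero even though the two features together determine the label completely. Shuffling either column costs the model a large chunk of its accuracy, and that is the kind of signal you only get by interrogating the model itself.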