Hacker News
by MikeUt 1768 days ago
I'm confused. Other than warning about small sample sizes, how can this tool measure bias without knowing what the true distribution is?

For example, given a database of violent crime convictions, would it be marked as biased because men make up over 70% of offenders, instead of an "unbiased" 50%?

1 comment

This initial version assumes that the entire dataset fairly represents the underlying phenomenon, i.e. that the "true" distribution can be estimated from the data itself. As the package develops and its users contribute knowledge about the true distributions in different domains, the tool should be able to operate without this assumption. Aggregating that knowledge so the assumption can eventually be removed is part of the roadmap.
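
To make that assumption concrete, here is a rough sketch of what "estimate the true distribution from the data" could look like. It is not the package's actual code: the function names, the column names, and the choice of total variation distance are all assumptions made for illustration. Each subgroup's label distribution is compared against the whole dataset's, which stands in for the "true" one.

```python
from collections import Counter


def distribution(values):
    """Return the empirical distribution of values as {value: probability}."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)


def subgroup_deviation(rows, group_key, target_key):
    """Compare each subgroup's target distribution with the whole dataset's.

    Assumption: the whole-dataset distribution is used as the reference,
    i.e. as the estimate of the "true" distribution.
    """
    reference = distribution(row[target_key] for row in rows)
    groups = {}
    for row in rows:
        groups.setdefault(row[group_key], []).append(row[target_key])
    return {
        group: total_variation(distribution(values), reference)
        for group, values in groups.items()
    }


# Hypothetical toy data: 70 rows for one group, 30 for the other.
rows = (
    [{"sex": "m", "label": 1}] * 30
    + [{"sex": "m", "label": 0}] * 40
    + [{"sex": "f", "label": 1}] * 10
    + [{"sex": "f", "label": 0}] * 20
)
deviation = subgroup_deviation(rows, "sex", "label")
for group, distance in sorted(deviation.items()):
    print(f"{group}: {distance:.3f}")
```

Note that in this sketch the 70/30 split between groups is never compared to 50/50. Only the label distribution within each group is compared to the dataset-wide one, so the result is only as trustworthy as the dataset itself.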