Hacker News new | ask | show | jobs
by Blikkentrekker 1641 days ago
I would assume these data sets are not manually selected but imported from some mechanism.

Other issues which are sure to arise is that the a.i. will have trouble with people who aren't smiling, and that the data set probably contains people who look better than average, and almost certainly excludes people who suffer from injuries or deformities in appropriate proportions.

Perhaps an interesting project is simply the compilation of a vast dataset of “world proportional pictures of people”. — It would be an interesting undertaking to realize such a dataset.

1 comments

World proportional is not good enough for this type of task. If we are to rely on AI for things like identifying people in pictures in a trial, we would need equal representation in the data set, so the AI doesn't have any kind of systematic bias. Otherwise, the AI's bias will compound errors in the real world. So you would need as many pictures of Australian aborigenees in the data set as Han Chinese people if you wanted to be sure there isn't a risk that a random person would be confused for someone of the over or under represented groups.