The wiki post mentioned that people are right 45% of the time. How do DL and ML stack up? Also not mentioned: how long does it take humans to start recognizing people of races that are new to them, compared with ML and DL?
Facial features can vary between ethnicities (see, e.g., https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3074358/), so a machine learning model trained solely on pictures from one ethnic group may not learn to reliably distinguish between different people of another ethnicity.
Human eyes also need to be trained on diverse data. That's the cross-race effect: https://en.wikipedia.org/wiki/Cross-race_effect
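
One way to check whether a model shows a similar gap is to break its accuracy down by group instead of reporting a single overall number. Here is a minimal sketch of that idea; the function name `accuracy_by_group`, the group labels "A" and "B", and the toy data are all made up for illustration and don't come from any real model or dataset.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return {group: fraction of correct predictions} for each group label."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        if truth == pred:
            correct[group] += 1
    return {g: correct[g] / total[g] for g in total}


# Hypothetical toy data: true identities, predicted identities,
# and the (made-up) demographic group of each test image.
y_true = ["ann", "bob", "cat", "dan", "eve", "fay"]
y_pred = ["ann", "bob", "cat", "dan", "fay", "eve"]
groups = ["A", "A", "A", "B", "B", "B"]

per_group = accuracy_by_group(y_true, y_pred, groups)
for group, acc in sorted(per_group.items()):
    print(f"group {group}: accuracy {acc:.2f}")
```

A large difference between groups would suggest the training data under-represents one of them, which is the machine analogue of the cross-race effect.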