The first processing step consists of (human) face detection. We use the standard OpenCV face detector for our faces.ethz.ch demonstrator. A failure at this step is likely to propagate into an unreliable or wrong attractiveness prediction. For attractiveness, age, and gender prediction, we start from a cropped image assumed to contain a (roughly aligned) face as found by the detector.
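To make the detect-then-crop step concrete, here is a minimal sketch in Python. The text only says "standard OpenCV", so the choice of the frontal-face Haar cascade shipped with the `opencv-python` package is an assumption. The helper names (`largest_box`, `crop_region`, `detect_and_crop`) and the 20% margin are hypothetical and are not taken from the demonstrator.

```python
def largest_box(boxes):
    """Return the (x, y, w, h) box with the largest area, or None if there are none."""
    boxes = [tuple(int(v) for v in b) for b in boxes]
    if not boxes:
        return None
    return max(boxes, key=lambda b: b[2] * b[3])


def crop_region(box, img_w, img_h, margin=0.2):
    """Grow the box by `margin` on every side and clip it to the image bounds.

    Returns (x0, y0, x1, y1). The margin value is an illustrative assumption.
    """
    x, y, w, h = box
    dx, dy = int(round(w * margin)), int(round(h * margin))
    x0 = max(0, x - dx)
    y0 = max(0, y - dy)
    x1 = min(img_w, x + w + dx)
    y1 = min(img_h, y + h + dy)
    return x0, y0, x1, y1


def detect_and_crop(image_path, margin=0.2):
    """Detect faces and return the crop around the largest one, or None."""
    import cv2  # imported here so the pure helpers above work without OpenCV

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # Assumption: the stock frontal-face Haar cascade bundled with opencv-python.
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    boxes = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

    box = largest_box(boxes)
    if box is None:
        # No face found: downstream predictions would be unreliable, so stop here.
        return None
    x0, y0, x1, y1 = crop_region(box, image.shape[1], image.shape[0], margin)
    return image[y0:y1, x0:x1]
```

Returning `None` when no face is found reflects the point above: a detection failure should be surfaced rather than passed on to the attractiveness, age, and gender predictors.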
I hope that this helps to understand the aforementioned result.