When I read a paper like this I'm looking for four things: (1) the data, (2) the benchmarks, (3) the architecture, (4) the controls/ablation.

1. The data: "We used a sample of 1,085,795 participants from three countries (the U.S., the UK, and Canada; see Table 1) and their self-reported political orientation, age, and gender. Their facial images (one per person) were obtained from their profiles on Facebook or a popular dating website... Facial images were processed using Face++37 to detect faces. Images were cropped around the face-box provided by Face++ (red frame on Fig. 1) and resized to 224 × 224 pixels."

2. The benchmarks: "For example, when asked to distinguish between two faces—one conservative and one liberal—people are correct about 55% of the time."

3. The controls: "What would an algorithm’s accuracy be when distinguishing between faces of people of the same age, gender, and ethnicity? To answer this question, classification accuracies were recomputed using only face pairs of the same age, gender, and ethnicity."

A. A complaint: Geography and income are two powerful conditioners. These can leak in so many ways: uncropped background (geography), image color and quality (income), eyeglass shape (geography and income). This study really needs more controls. Geography and income would be a nice start.
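
To make the matched-pairs control quoted above concrete, here is a rough Python sketch of how such an accuracy could be recomputed with an extra matching key. It is not the paper's code. The field names (`score`, `label`, `region`) and the toy records are made up for illustration, and the scores are constructed to depend only on region, which is the kind of leak the complaint is about.

```python
from collections import defaultdict

def matched_pair_accuracy(records, keys):
    """Pairwise accuracy over (conservative, liberal) pairs that agree on `keys`.

    records: dicts with a "score" (higher = predicted more conservative),
             a "label" (0 = liberal, 1 = conservative), and the matching fields.
    keys:    field names a pair must share; an empty list means no matching.
    """
    # stratum -> (liberal scores, conservative scores)
    groups = defaultdict(lambda: ([], []))
    for r in records:
        stratum = tuple(r[k] for k in keys)
        groups[stratum][r["label"]].append(r["score"])

    correct, total = 0.0, 0
    for lib_scores, con_scores in groups.values():
        for c in con_scores:
            for l in lib_scores:
                total += 1
                if c > l:
                    correct += 1.0
                elif c == l:
                    correct += 0.5  # a tie counts as a coin flip
    return correct / total if total else float("nan")

# Toy data (invented): the score carries region information and nothing else.
records = [
    {"score": 0.2, "label": 0, "region": "urban"},
    {"score": 0.2, "label": 0, "region": "urban"},
    {"score": 0.2, "label": 1, "region": "urban"},
    {"score": 0.8, "label": 0, "region": "rural"},
    {"score": 0.8, "label": 1, "region": "rural"},
    {"score": 0.8, "label": 1, "region": "rural"},
]

unmatched = matched_pair_accuracy(records, [])
matched = matched_pair_accuracy(records, ["region"])
```

On this toy set the unmatched accuracy is above chance and the region-matched accuracy falls back to 0.5, because the score was built to carry nothing beyond region. Whether the paper's classifier behaves that way is exactly what a geography or income control would test.
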
> Their facial images (one per person) were obtained from their profiles on Facebook or a popular dating website
so of course the first thing that comes to mind is "how good of a predictor is just knowing which of those two sites the image came from?"
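
One way to put a number on that question, as a back-of-the-envelope Python sketch: score each image by nothing but its source site, then compute the same kind of pairwise accuracy (one conservative face vs. one liberal face). The per-site counts below are invented placeholders, not figures from the paper. You would need the real liberal/conservative breakdown per site to get an actual baseline.

```python
def source_only_pair_accuracy(counts):
    """Pairwise accuracy of a predictor that only sees the source site.

    counts: {site: (n_liberal, n_conservative)}, each site non-empty.
    Each person is scored by the conservative share of their site; a pair is
    correct when the conservative's site has the higher share, and a tie
    (same share, e.g. same site) counts as half.
    """
    total_lib = sum(l for l, _ in counts.values())
    total_con = sum(c for _, c in counts.values())
    share = {site: c / (l + c) for site, (l, c) in counts.items()}

    correct = 0.0
    for site_con, (_, n_con) in counts.items():
        for site_lib, (n_lib, _) in counts.items():
            pairs = n_con * n_lib
            if share[site_con] > share[site_lib]:
                correct += pairs
            elif share[site_con] == share[site_lib]:
                correct += 0.5 * pairs
    return correct / (total_lib * total_con)

# Hypothetical counts, for illustration only.
skewed = source_only_pair_accuracy({"facebook": (600, 400), "dating": (400, 600)})
balanced = source_only_pair_accuracy({"facebook": (500, 500), "dating": (500, 500)})
```

With a made-up 60/40 skew in opposite directions on the two sites, the source alone gets 60% of pairs right. With no skew it sits at 50%. So how much of the reported accuracy the source explains depends entirely on how lopsided the two samples actually are.
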