Hacker News new | ask | show | jobs
by yashvg 503 days ago
This behavior likely stems from RLHF training - at least for the part after the images of the scatter plot are given to the models. Models were probably heavily penalized during training for pattern-matching that could lead to problematic racial misclassifications, similar to the issues Google faced with their image recognition systems in 2015. The tendency to be overly cautious about recognizing primate shapes, even in abstract data visualizations, could be an emergent behavior from these training constraints.