This wasn't so much overtraining as the models learning something different from what we expected. If you look at a pixel-by-pixel representation of an image, textures tend to be more distinctive patterns than shapes. There are some funny studies from the mid-2010s exploring this.