I read the paper, and I also have some reservations. The procedure they used to extract and randomize their data seems biased towards large homogeneous areas.

In short, their procedure seems to make it possible to rope off a large contiguous area of the Mojave Desert, ground-truth it as "barren" using their GUI system, and have that area carved up into 28x28 pixel chips that are spread equally across the training and test sets.

In such a case, the training and test sets are not really independent, because chips cut from the same area look nearly alike. And their 6 classes, as you point out, are amenable to color features.
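To make the leakage concrete, here is a minimal sketch of the kind of chip-level shuffle I'm worried about. This is my reading of the risk, not the paper's actual code; the region names, the chip dictionaries, and both function names are hypothetical.

```python
import random

def random_chip_split(chips, test_fraction=0.5, seed=0):
    """Shuffle chips individually, ignoring which labeled region each came from."""
    rng = random.Random(seed)
    shuffled = list(chips)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)

def shared_regions(train, test):
    """Regions that contribute chips to both the training and the test set."""
    return {c["region"] for c in train} & {c["region"] for c in test}

# Hypothetical data: 3 hand-labeled regions, each carved into 100 chips.
chips = [
    {"region": region, "chip_id": i}
    for region in ("region_a", "region_b", "region_c")
    for i in range(100)
]
train, test = random_chip_split(chips)
leaky = shared_regions(train, test)
print(sorted(leaky))
```

With toy data like this, every region ends up on both sides of the split, so the test score mostly measures how well the model recognizes terrain it has already seen.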

Having done classification of remote sensing data, I can say the above is not a good test of accuracy at any useful task. You have to test accuracy on representative data.

That means training within a few areas, and testing on geographically distant but ecologically similar areas (i.e., same class, but statistically independent). It also means varying things like time of day, observing geometry, and seasonality. Color features will be quite fragile in such tests.
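A rough sketch of the geographic holdout described above, assuming each chip carries a tag for the region it was cut from. The `region` field, the site names, and `split_by_region` are all made up for illustration; scikit-learn's group-aware splitters do the same job if you already use that library.

```python
import random

def split_by_region(chips, test_fraction=0.3, seed=0):
    """Hold out whole regions so no region contributes chips to both sets."""
    rng = random.Random(seed)
    regions = sorted({c["region"] for c in chips})
    rng.shuffle(regions)
    n_test = max(1, round(len(regions) * test_fraction))
    held_out = set(regions[:n_test])
    train = [c for c in chips if c["region"] not in held_out]
    test = [c for c in chips if c["region"] in held_out]
    return train, test

# Hypothetical data: 10 geographically separate sites of the same class,
# each carved into 50 chips.
chips = [
    {"region": f"site_{r}", "label": "barren", "chip_id": i}
    for r in range(10)
    for i in range(50)
]
train, test = split_by_region(chips)
train_regions = {c["region"] for c in train}
test_regions = {c["region"] for c in test}
print(len(train_regions), len(test_regions), train_regions & test_regions)
```

This only handles the geographic part. Holding out by acquisition date, sun angle, or season would work the same way with a different grouping key.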

It also means testing on a more diverse sample, to see whether "none of the above" can be detected, because their class decomposition is far from exhaustive.
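One simple way to give a classifier a "none of the above" output is to reject low-confidence predictions. The sketch below is a baseline of my own, not something from the paper: the threshold value and every class name except "barren" are placeholders, and raw classifier confidences are known to be overconfident on unfamiliar inputs, so this would need to be validated on terrain outside the labeled classes.

```python
def predict_with_reject(probs, class_names, threshold=0.8):
    """Return the top class, or 'none_of_the_above' if confidence is too low."""
    if len(probs) != len(class_names):
        raise ValueError("probs and class_names must have the same length")
    best = max(range(len(probs)), key=lambda i: probs[i])
    if probs[best] < threshold:
        return "none_of_the_above"
    return class_names[best]

# Placeholder names for a 6-class problem; only "barren" comes from the discussion.
class_names = ["barren", "class_2", "class_3", "class_4", "class_5", "class_6"]

confident = predict_with_reject([0.95, 0.01, 0.01, 0.01, 0.01, 0.01], class_names)
unsure = predict_with_reject([0.30, 0.25, 0.15, 0.10, 0.10, 0.10], class_names)
print(confident, unsure)
```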