I don't think it quite says that. Exact quote:

> We assume the failure cases are related to two reasons. On one hand, the GOD training set and testing set have no overlapping classes. That is to say, the model could learn the geometric information from the training but cannot infer unseen classes in the testing set

Now, if all the GT results had been fails, it might be reasonable to conclude that the model doesn't work when the sets don't overlap. However, there are only 6 that they graded as fails. (A few more look iffy to me.) If I'm reading their statement correctly, there was no overlap between the two sets:

> This dataset consists of 1250 natural images from 200 distinct classes from ImageNet, where 1200 images are used for training. The remaining 50 images from classes not present in the training data are used for testing

And if I'm understanding this correctly, that makes the results look sort of impressive. I mean, at the very least, the model is getting the right class from the testing set most of the time, even though that class wasn't in the training set. That's ... not ... nothing?

On the other hand, it seems they cherry-picked the best of five subjects for the results they show in the supplementary material, which is ridiculous.

> Subject 3 has a significantly higher SNR than the others. A higher SNR leads to better performance in our experiments, which has also been shown in various literature.