That[1] used a hand-picked set of ambiguous images and still got 60% overall accuracy across 11k participants. I don't know much about statistics[2], but 1) 60% HAS to be statistically significant, and 2) it was under ADVERSARIAL, not neutral, conditions. So people can tell.

Anyway, that's beside my point. My point is that it always turns into all-caps flamewars like this, with no middle ground or third camp, and that this has to be more of a phenomenon than regular disagreement. This isn't bikeshedding. This is Spanish bullfighting centered around a piece of red cloth.

1: https://news.ycombinator.com/item?id=42216694

2: I just asked Gemini "is 60% accuracy over 11k participant for a test statistically significant and why", and it said "yes, it is overwhelmingly statistically significant" and "completely off the charts". It said the p<0.05 threshold would be 50.94%.
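
For what it's worth, the figure in [2] can be sanity-checked by hand. This is only a rough sketch and it rests on assumptions that are mine, not the study's: it treats the 11k participants as 11,000 independent yes/no trials, takes 50% as chance level, and uses a two-sided normal-approximation z-test at p<0.05. The function names are made up for illustration.

```python
import math

def standard_error(n, p0=0.5):
    # Standard error of an observed proportion if the true rate were p0.
    return math.sqrt(p0 * (1 - p0) / n)

def significance_threshold(n, p0=0.5, z_crit=1.96):
    # Smallest accuracy that clears a two-sided z-test at p<0.05.
    # z_crit=1.96 is the usual two-sided 5% critical value.
    return p0 + z_crit * standard_error(n, p0)

def z_score(p_hat, n, p0=0.5):
    # How many standard errors the observed accuracy sits above chance.
    return (p_hat - p0) / standard_error(n, p0)

n = 11_000  # assumption: one independent trial per participant
threshold = significance_threshold(n)
z = z_score(0.60, n)

print(f"accuracy needed for p<0.05: {threshold:.2%}")
print(f"z-score for 60% accuracy:   {z:.1f}")
```

Under those assumptions the threshold comes out at roughly 50.9%, in line with the 50.94% quoted, and 60% sits about 21 standard errors above chance. The real test showed each participant several images, so the true trial count and the independence assumption both differ from this sketch.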