by tjungblut 996 days ago
I can't find it right now, but a couple of years back there was a "meta-feature": the size of the MNIST image. I think it scored about 90'ish % accuracy on its own, without even looking at the image.
2 comments

A few years back I worked on a project that involved fingerprinting screenshots of web pages, and compressed image size was pretty much as good as any fingerprinting method we could come up with for comparing the similarity between them.
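For anyone curious what "compressed size as a fingerprint" could look like, here is a minimal sketch. It is not the project's actual method: it assumes zlib as the compressor and raw bytes as input, and `compressed_size` and `size_similarity` are hypothetical names made up for illustration.

```python
import zlib


def compressed_size(data: bytes, level: int = 9) -> int:
    """Length in bytes of the zlib-compressed payload (assumed compressor)."""
    return len(zlib.compress(data, level))


def size_similarity(a: bytes, b: bytes) -> float:
    """Ratio of the smaller compressed size to the larger one.

    1.0 means identical compressed sizes; values near 0 mean very different
    sizes. This is a crude proxy: two unrelated inputs can share a size.
    """
    size_a, size_b = compressed_size(a), compressed_size(b)
    return min(size_a, size_b) / max(size_a, size_b)


if __name__ == "__main__":
    page_a = b"<html>" + b"hello world " * 100 + b"</html>"
    page_b = b"<html>" + b"hello world " * 101 + b"</html>"
    print(size_similarity(page_a, page_b))
```

In practice you would feed it the screenshot bytes (or just compare the file sizes of already-compressed images) and pick a similarity cutoff empirically.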
The off-the-beaten-path nature of this reminds me of banks sending $<arbitrary decimal amount> as a PIN for auth.
Makes sense, considering that's not too terribly far off from what the KL divergence does.
What do you mean by “size”? Gzipped size? If you simply look at how dark an MNIST image is (count the percentage of dark pixels) you’ll get about 20% accuracy, which is twice as good as a random guess but a long way from 90’ish %.
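A rough sketch of that dark-pixel experiment, in case someone wants to try it. Everything here is an assumption for illustration: images are lists of rows of 0-255 grayscale values, "dark" means below a threshold of 128, and the classifier is a simple nearest-class-mean on that single feature. The function names are hypothetical, and no accuracy figure is implied by the code itself.

```python
def dark_fraction(image, threshold=128):
    """Fraction of pixels with intensity below `threshold` (assumed cutoff)."""
    pixels = [p for row in image for p in row]
    return sum(1 for p in pixels if p < threshold) / len(pixels)


def fit_class_means(images, labels):
    """Average dark fraction per label, computed from training examples."""
    sums, counts = {}, {}
    for image, label in zip(images, labels):
        sums[label] = sums.get(label, 0.0) + dark_fraction(image)
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(image, class_means):
    """Label whose mean dark fraction is closest to this image's."""
    feature = dark_fraction(image)
    return min(class_means, key=lambda label: abs(class_means[label] - feature))
```

To reproduce the number quoted above you would load the real MNIST arrays, fit on the training split, and score `predict` on the test split.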
What do you mean by accuracy here? Usually 50% accuracy means a coin toss, meaning 20% accuracy is equal to 80% accuracy, which is better than the article's 78% and not that far from 90%.
> meaning 20% accuracy is equal to 80% accuracy,

Only if your model is outputting a yes/no answer, right? And only if your definition of accuracy is "class with highest probability" (and not "N classes with highest prob").

If your dataset has more than 2 classes like MNIST, a super low accuracy only tells you to ignore the class the model guesses. It doesn't tell you which of the remaining classes is correct
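To put numbers on it, here is a small worked sketch. It assumes you throw away the model's guess and pick uniformly at random among the remaining classes, with no other information; the function names are hypothetical.

```python
def chance_accuracy(num_classes: int) -> float:
    """Accuracy of a uniform random guess over `num_classes` classes."""
    return 1.0 / num_classes


def avoid_guess_accuracy(accuracy: float, num_classes: int) -> float:
    """Expected accuracy when you reject the model's guess and pick
    uniformly among the other classes.

    The guess is wrong with probability (1 - accuracy); in that case the
    true class is one of the remaining (num_classes - 1) options.
    """
    return (1.0 - accuracy) / (num_classes - 1)


if __name__ == "__main__":
    # Binary case: inverting a 20%-accurate model gives 80%.
    print(avoid_guess_accuracy(0.2, 2))
    # Ten classes: rejecting the guess leaves nine candidates, roughly 9%.
    print(avoid_guess_accuracy(0.2, 10))
    # Baseline for ten classes: 10%.
    print(chance_accuracy(10))
```

So the "flip it" trick only recovers the complementary accuracy when there are exactly two classes; with ten classes, 20% is simply twice the 10% chance baseline.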

There are ten choices, so getting the answer right 20% of the time is very plausible.
"One simple trick to beat the statistical odds...."
There are 10 classes in MNIST. A random guess would be 10%.