by btilly 3035 days ago
Anyone interested in the practical consequences of these counterintuitive properties should look up https://en.wikipedia.org/wiki/Curse_of_dimensionality to see how they affect fields like machine learning, where we naturally work with many dimensions.
2 comments

The behavior the article describes is, to me, way more bizarre than the curse of dimensionality.

It's tempting to think of data sets as "point clouds". This article is a reality check for me: you can't safely apply intuition about 2- and 3-d point clouds to higher-dimensional data. I suspect that this explains why methods like t-SNE seem to produce unstable results depending on the parameters [0]. The notion of a "neighbor" in high dimensions is just not what I think it is.
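A rough way to see the "neighbor" problem for yourself is to compare how far the nearest and farthest points are from a query point as the dimension grows. This is only a sketch under my own assumptions: uniform random points in the unit cube, Euclidean distance, and a made-up helper called `distance_contrast` (none of that comes from the article).

```python
import math
import random


def distance_contrast(dim, n_points=200, seed=0):
    """Relative gap between the farthest and nearest of n_points
    uniform random points in [0, 1]^dim, measured from one random query.

    Assumption: uniform data and Euclidean distance. Real data sets
    can behave differently.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    nearest, farthest = min(dists), max(dists)
    # A large value means "nearest" is meaningfully closer than "farthest".
    return (farthest - nearest) / nearest


if __name__ == "__main__":
    for dim in (2, 3, 10, 100, 1000):
        print(dim, round(distance_contrast(dim), 3))
```

Under these assumptions the contrast shrinks as the dimension goes up, so the nearest neighbor ends up almost as far away as the farthest point.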

I suppose the same is true for high-dimensional cost surfaces. Gradient descent is often described as "like walking down a hill". But without a deep understanding of high-dimensional geometry, I'm not at all confident that I know what a 4-, 10-, or 1000-dimensional hill looks like.

The lesson: Be skeptical of my own geometric intuition unless it is firmly backed by math.

[0]: https://distill.pub/2016/misread-tsne/

Indeed. I find https://www.thestar.com/news/insight/2016/01/16/when-us-air-... to be a good cautionary tale on how our intuition about people being close to average is misleading - nobody is. And nobody is particularly like anyone else, either.
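To put a number on "nobody is average", here is a small simulation. It is a sketch with assumptions of my own, not anything taken from the linked story: traits are independent standard normal values, "average" means within 0.3 standard deviations of the mean, and `fraction_average_on_all` is a made-up helper name.

```python
import random


def fraction_average_on_all(n_traits, n_people=10000, band=0.3, seed=0):
    """Fraction of simulated people who fall within `band` standard
    deviations of the mean on every one of n_traits traits.

    Assumption: traits are independent standard normals. Real body
    measurements are correlated, so this overstates the effect somewhat.
    """
    rng = random.Random(seed)
    count = 0
    for _ in range(n_people):
        if all(abs(rng.gauss(0.0, 1.0)) <= band for _ in range(n_traits)):
            count += 1
    return count / n_people


if __name__ == "__main__":
    for n_traits in (1, 2, 3, 5, 10):
        print(n_traits, fraction_average_on_all(n_traits))
```

With independent traits, the chance of being average on all of them is the single-trait chance raised to the number of traits, so it collapses quickly even when being average on any one trait is common.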

On a thousand dimensional hill, my intuition is that it locally looks like a low dimensional hill, along axes that you can find through techniques like Principal Components Analysis. This has yet to mislead me. On the other hand, my pure math background was a long time ago, and I have not explored machine learning in any real depth...

But what this article makes me think is that even "low" dimensions can't be trusted once they're greater than 3.

Indeed! Bishop's "Neural Networks for Pattern Recognition" has exercises in chapter 1 on the same subject.