by nerdponx
3034 days ago
The behavior the article describes is, to me, way more bizarre than the curse of dimensionality. It's tempting to think of data sets as "point clouds". This article is a reality check for me: you can't safely apply intuition about 2- and 3-d point clouds to higher-dimensional data. I suspect that this explains why methods like t-SNE seem to produce unstable results depending on the parameters [0]. The notion of a "neighbor" in high dimensions is just not what I think it is. I suppose the same is true for high-dimensional cost surfaces. Gradient descent is often described as "like walking down a hill". But without a deep understanding of high-dimensional geometry, I'm not at all confident that I know what a 4-, 10-, or 1000-dimensional hill looks like. The lesson: be skeptical of my own geometric intuition unless it is firmly backed by math.
[0]: https://distill.pub/2016/misread-tsne/
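For anyone who wants to poke at the "neighbor" point numerically, here is a rough sketch, not anything taken from the article. It assumes points drawn uniformly from the unit cube and measures how much farther the farthest point is than the nearest one, relative to the nearest. The function name `distance_contrast` and the sample sizes are made up for illustration.

```python
import math
import random


def distance_contrast(dim, n_points=200, seed=0):
    """Relative gap between the farthest and nearest of n_points
    uniform random points in [0, 1]^dim, as seen from one random query point.

    Illustrative assumption: uniform data in the unit cube. Real data sets
    can behave differently.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))  # Euclidean distance
    nearest, farthest = min(dists), max(dists)
    return (farthest - nearest) / nearest


if __name__ == "__main__":
    for dim in (2, 10, 100, 1000):
        print(dim, round(distance_contrast(dim), 3))
```

Under these assumptions the contrast shrinks as the dimension grows, so the "nearest" neighbor ends up only slightly closer than the farthest point. That is one concrete way low-dimensional intuition about neighborhoods stops applying.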
|
On a thousand-dimensional hill, my intuition is that it locally looks like a low-dimensional hill, along axes that you can find through techniques like principal component analysis. This has yet to mislead me. On the other hand, my pure math background was a long time ago, and I have not explored machine learning in any real depth...