by daniel-levin 3781 days ago
This comment got me thinking: in some applications, Euclidean distance between feature vectors acts as a good proxy for adjacency/similarity. For such applications, an isometry from R^n to R^2 or R^3 should in principle preserve the meaning of adjacency. A quick Google search turns up techniques [0, 1] for quasi-isometric and isometric dimensionality reduction. These should mitigate artefacts of adjacency, or non-adjacency, as it were. In other words, you might actually be able to pull off good 2D projections of high-dimensional data and still see meaningful relationships.

[0] https://en.wikipedia.org/wiki/Isomap

[1] https://www.aaai.org/Papers/AAAI/2007/AAAI07-083.pdf
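To make "preserving adjacency" concrete, here is a minimal sketch that compares every pairwise Euclidean distance before and after a projection. The function names and sample points are made up for illustration, and simply dropping the last coordinate stands in for a real reducer such as Isomap; it is not the method from either link.

```python
import itertools
import math


def pairwise_distances(points):
    """Euclidean distances for all pairs i < j, in a fixed order."""
    return [math.dist(p, q) for p, q in itertools.combinations(points, 2)]


def max_distortion(high, low):
    """Largest absolute change in any pairwise distance after projection."""
    before = pairwise_distances(high)
    after = pairwise_distances(low)
    return max(abs(a - b) for a, b in zip(before, after))


def drop_last(points):
    """Hypothetical stand-in projection: discard the final coordinate."""
    return [p[:-1] for p in points]


# Points that already lie in the plane z = 0: the projection is an isometry.
flat = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (3.0, 1.0, 0.0)]
# Points with varying z: dropping z changes some distances.
bumpy = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0), (0.0, 2.0, 0.0), (3.0, 1.0, 2.0)]

print(max_distortion(flat, drop_last(flat)))    # essentially zero
print(max_distortion(bumpy, drop_last(bumpy)))  # clearly nonzero
```

A distortion near zero means neighbours in the original space are still neighbours in the picture; a large value means the 2D plot can show points as close that were originally far apart, or the reverse.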

1 comments

Sammon mapping is another famous example; see [1] for a nice visualization.

[1] http://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/AV09...

>> Provides us with a measure of the quality of any given transformed dataset. However, we still need to determine the optimal such dataset, in terms of minimising E. Strictly speaking, this is an implementation detail and the Sammon mapping itself is simply defined as the optimal transformation;

Somehow it's technically challenging to verify the content of this article.

I was referencing it mostly for the visualization of the "flower" that fails with PCA/linear mapping.

Sammon's original paper is here [1]. That said, from what I know, Isomap is the more widespread tool, but I never found such a good visualization.

[1] http://theoval.cmp.uea.ac.uk/~gcc/matlab/sammon/sammon.pdf
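For reference, the error E mentioned in the quoted passage can be sketched in a few lines. This assumes the usual definition of Sammon's stress, E = (1 / sum of d*_ij) * sum of (d*_ij - d_ij)^2 / d*_ij over pairs i < j, where d*_ij are distances in the original space and d_ij are distances in the reduced one. The function name and toy points are hypothetical, and the minimisation step (the "implementation detail" in the quote) is left out.

```python
import itertools
import math


def sammon_stress(high, low):
    """Sammon's stress E between original points and their low-dimensional images."""
    total = 0.0  # sum of original distances, used as the normalising constant
    error = 0.0  # sum of squared distance errors, each weighted by 1 / d*_ij
    pairs_high = itertools.combinations(high, 2)
    pairs_low = itertools.combinations(low, 2)
    for (p, q), (a, b) in zip(pairs_high, pairs_low):
        d_star = math.dist(p, q)  # distance in the original space
        d = math.dist(a, b)       # distance after the mapping
        if d_star == 0.0:
            raise ValueError("duplicate points make the stress undefined")
        total += d_star
        error += (d_star - d) ** 2 / d_star
    return error / total


# Toy example: a 3-4-5 right triangle in 2D flattened onto a line.
triangle = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
on_a_line = [(0.0,), (3.0,), (-4.0,)]

print(sammon_stress(triangle, triangle))   # 0.0: nothing was distorted
print(sammon_stress(triangle, on_a_line))  # positive: the hypotenuse is stretched to 7
```

Because each term is divided by the original distance, errors on small distances count for more, which is why the mapping tends to preserve local neighbourhoods at the cost of global layout.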