by boombard
4219 days ago
Nice article. You could consider maximum spanning trees as a way to prune your correlation graph; they are very effective at suggesting the underlying structure or kinetics of a system. Just run a minimum spanning tree algorithm on the inverse of your correlations (for example, a distance such as 1 - |r|), so that the strongest correlations become the shortest edges.

[1] http://en.wikipedia.org/wiki/Spanning_tree

Another approach is to use PCA on the adjacency matrix. This can generate interesting clusters based on the latent variables. At the risk of self-promotion, I co-authored a paper on this technique, which validated known pathways in a metabolic network.

[2] http://www.biomedcentral.com/1471-2105/13/197

Anyway, this is a great field to explore; glad to see it getting traction on HN!
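
For anyone who wants to try the spanning tree suggestion, here is a minimal sketch. It assumes SciPy is available and uses `2 - |r|` as the edge weight, which is my own choice rather than anything from the article: it orders edges the same way as `1 - |r|` but stays strictly positive, because SciPy reads a zero in a dense matrix as "no edge". The function name `max_correlation_tree` and the sample matrix are made up for illustration.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree


def max_correlation_tree(corr):
    """Return the edges (i, j, r) of the maximum-|correlation| spanning tree."""
    corr = np.asarray(corr, dtype=float)
    # Strong correlation -> small weight. Using 2 - |r| (assumption, see above)
    # keeps every real edge strictly positive, even when |r| == 1.
    weights = 2.0 - np.abs(corr)
    # Zero entries are treated as missing edges, so this removes self-loops.
    np.fill_diagonal(weights, 0.0)
    mst = minimum_spanning_tree(weights).tocoo()
    return sorted(
        (int(min(i, j)), int(max(i, j)), float(corr[i, j]))
        for i, j in zip(mst.row, mst.col)
    )


# Hypothetical 4-variable correlation matrix, only for demonstration.
corr = np.array([
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.8, 0.3],
    [0.1, 0.8, 1.0, 0.7],
    [0.2, 0.3, 0.7, 1.0],
])
edges = max_correlation_tree(corr)
for i, j, r in edges:
    print(i, j, r)
```

The tree keeps n - 1 of the n(n - 1)/2 possible edges, so it is a drastic pruning; whether that is what you want depends on the data.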
If several dimensions are correlated just about equally strongly, you can get very different trees based on small random variation. There's no guarantee that all significant correlations are displayed, or that correlated dimensions are visually close to one another.
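
A small sketch of that instability, under assumptions of my own: four variables that all correlate at 0.8, with three pairs nudged up by 0.001. The helper names (`tree_edges`, `nearly_tied`) and the numbers are hypothetical, and the edge weight `2 - |r|` is just a positive stand-in for `1 - |r|` so SciPy does not read a zero as a missing edge.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree


def tree_edges(corr):
    """Edge set of the maximum-|correlation| spanning tree."""
    corr = np.asarray(corr, dtype=float)
    weights = 2.0 - np.abs(corr)    # strictly positive, same ordering as 1 - |r|
    np.fill_diagonal(weights, 0.0)  # zero means "no edge" for dense input
    mst = minimum_spanning_tree(weights).tocoo()
    return {(int(min(i, j)), int(max(i, j))) for i, j in zip(mst.row, mst.col)}


def nearly_tied(bumped, n=4, base=0.8, eps=1e-3):
    """All pairs correlate at `base`; the pairs in `bumped` get `base + eps`."""
    corr = np.full((n, n), base)
    np.fill_diagonal(corr, 1.0)
    for i, j in bumped:
        corr[i, j] = corr[j, i] = base + eps
    return corr


# Two matrices that differ by at most 0.001 in any entry...
path = tree_edges(nearly_tied([(0, 1), (1, 2), (2, 3)]))
star = tree_edges(nearly_tied([(0, 1), (0, 2), (0, 3)]))

# ...yet one tree is a chain and the other is a star centred on variable 0.
print(sorted(path))
print(sorted(star))
```

In both cases the three pairs left out of the tree still correlate at 0.8, which is the "no guarantee that all significant correlations are displayed" point.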