|
Re:classification... afaict, only in the rare case of complete graphs. You can extract those.. but they're dull. The use of rainbow coloring IS indicative of what folks do for classification & motif finding already on real-world graphs and even tables: -- Spectral methods: Similar to looking at the top eigen values for PCA, they score the connectivity matrix. I haven't seen a great explanation of this, but the Microsoft Defender talk at BlueHat last year shows a decent place where to use it. -- Counting types in weakly connected components of a property graphs. Most graphs connect already-classified entities, and tons of methods let you get components out of them. You can get quite far just by making a vector like "cluster 3 has: two reds, three yellows". For a security graph, something like, "3 different IPs, and two alerts of type X". Matt Swann's talk at last year's https://www.graphtheplanet.com/ on a MS internal security team's scalable approach is great here. (We're about to release the last wave of tickets to this year's GraphThePlanet, if you're in SF next week!). -- Tables: At the same Microsoft BlueHat event as the spectral methods talk, I demoed how to understand the structure behind a log alert dump via a simple & automatic 'hypergraph transform' that lets you use any of these methods even on regular data tables and logs: https://twitter.com/Graphistry/status/1189966458844930048 (jump to 12:45). This is how ~half of our users use Graphistry, and a generalization of what Matt does. There are a bunch of papers on 'motif mining', and in deep learning, handling graph-y domains via random walks, and they often come down to a variant of #2 above. For deep learning, I'm curious about combining w/ #3 to go from 2010's era ~perception-on-pixels -> 2020 era ~behavior-on-traces. |
One general interpretation is that the eigenvalues of the adjacency & other matrices that capture the connectivity of the graph are solutions to optimization problems [0]. Usually, these optimization problems and their solutions correspond to relaxations of interesting combinatorial features or operations on graphs - e.g. different ways of cutting graphs to yield binary partitions on the nodes (spectral clustering [1]).
[0] https://en.wikipedia.org/wiki/Min-max_theorem
[1] https://en.wikipedia.org/wiki/Spectral_clustering