|
Yes, this is a good example! We model events this way: endpoint, network, web logs, click streams, anything with entities like pids/IPs/users/files/URLs and supporting metadata like hashes and timestamps. Connections can be discrete, like "same hash", or trickier and fuzzy, like "similar shape" or "time-correlated". Microsoft's Matt Swann gave a great talk at BlueHat on how they do that, and Graphistry's talk there gives a bunch more examples (see our site + pinned on Twitter). GCNs (graph convolutional networks) emerging over the last ~year can mix many discrete + continuous attributes. As for why graph vs. tabular: graphs are better at pulling in connected data ("process recently accessed a file and communicated to another process, which also recently made an outbound connection with similar bytes to another server that few others use"). Before, GNN tools struggled with either the heterogeneity or the scale of most event data tasks, so they were mostly academic research or super specialized, like protein folding. Imagine 100K+ events, or distinguishing connections like someone positively chatting with a colleague vs. negatively blocking them. Recent algorithms have been solving these scenarios at scale, though the tools are still hard even for most data scientists, and still a time suck for folks not already breathing GPU neural networks. The Graphistry team has been trying to make graph tech easier and more snap-in, initially by doing GPU-accelerated visual graph analytics + visual graph investigation templating for teams here, such as if you are on an IR/hunt team using a logs DB or even a graph DB. Ex: You can use it directly from Splunk queries, drag-and-drop with CSV extracts, or from data science notebooks with dataframes. That can be a nice launching point for seeing event data as graphs, and for the types of graph-enabled insights that are hard to get with just log dumps and bar charts.
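To make the "events as graphs" idea concrete, here is a rough sketch in plain Python. It is not Graphistry's implementation: the event rows, field names, `window` threshold, and helper functions are all made up for illustration. It shows one way to turn log rows into edges, with one discrete link type ("same hash") and one fuzzy one ("time-correlated").

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical event rows; the field names and values are illustrative only.
events = [
    {"ts": 100, "pid": "pid:41", "ip": "ip:10.0.0.5", "file": "file:a.exe", "hash": "h1"},
    {"ts": 105, "pid": "pid:77", "ip": "ip:10.0.0.9", "file": "file:b.exe", "hash": "h1"},
    {"ts": 400, "pid": "pid:90", "ip": "ip:10.0.0.5", "file": "file:c.txt", "hash": "h2"},
]

ENTITY_COLS = ["pid", "ip", "file"]  # columns treated as graph nodes


def events_to_edges(rows, entity_cols=ENTITY_COLS):
    """Link every pair of entities that co-occur in the same event."""
    edges = []
    for row in rows:
        present = [row[c] for c in entity_cols if row.get(c)]
        for src, dst in combinations(present, 2):
            edges.append({"src": src, "dst": dst, "kind": "same_event", "ts": row["ts"]})
    return edges


def same_hash_edges(rows):
    """Discrete link: connect files that share a hash."""
    by_hash = defaultdict(set)
    for row in rows:
        if row.get("hash") and row.get("file"):
            by_hash[row["hash"]].add(row["file"])
    edges = []
    for h, files in by_hash.items():
        for src, dst in combinations(sorted(files), 2):
            edges.append({"src": src, "dst": dst, "kind": "same_hash", "hash": h})
    return edges


def time_correlated_edges(rows, window=10):
    """Fuzzy link: connect distinct pids whose events are within `window` time units."""
    ordered = sorted(rows, key=lambda r: r["ts"])
    edges = []
    for a, b in combinations(ordered, 2):
        if a["pid"] != b["pid"] and abs(b["ts"] - a["ts"]) <= window:
            edges.append({"src": a["pid"], "dst": b["pid"], "kind": "time_correlated"})
    return edges


edges = events_to_edges(events) + same_hash_edges(events) + time_correlated_edges(events)
for edge in edges:
    print(edge)
```

The result is a flat edge list with `src`/`dst` columns plus attributes, which is the shape most graph tools and dataframe workflows expect. The pairwise time comparison is quadratic, so at the 100K+ event scale mentioned above you would want a sorted sliding window or a GPU dataframe instead.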
We are now in R&D with sec/fraud/etc. teams and our GPU partners to bring GNNs into workflows with hopefully similar ease. If anyone is interested, reach out at Leo@graphistry.com or swing by our Slack. Very cool time for graph + GPU tech! |