by moshegramovsky
689 days ago
This is mostly a data structure problem. I am certain this can be made interactive, but it will require some elbow grease. If you want it to be interactive, you will need to figure out a few things:
1.) how to format the data so it can be streamed off disk.
2.) how to cull the offscreen bounding boxes quickly.
3.) how to cull tiny bounding boxes quickly.
The central problem is finding a way to group the nodes efficiently into chunks. A 2D approach is probably best. You would then have something that could be rendered efficiently. Other than that, maybe a point cloud renderer? There might be one you can buy off the shelf, or something open source.
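To make the two culling steps concrete, here is a minimal sketch of one possible 2D chunking scheme: a uniform grid that buckets bounding boxes by cell, so a viewport query only looks at nearby chunks, then drops boxes that are offscreen or smaller than a pixel threshold. The names (Box, build_grid, visible), the cell size, and the min_pixels threshold are all hypothetical choices for illustration, not anything from the original data set or a specific renderer.

```python
import math
from collections import defaultdict
from typing import NamedTuple


class Box(NamedTuple):
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def cells_for(box, cell):
    """Yield the (col, row) key of every grid cell that the box overlaps."""
    for col in range(math.floor(box.min_x / cell), math.floor(box.max_x / cell) + 1):
        for row in range(math.floor(box.min_y / cell), math.floor(box.max_y / cell) + 1):
            yield (col, row)


def build_grid(boxes, cell):
    """Group box indices into chunks keyed by grid cell."""
    grid = defaultdict(list)
    for i, box in enumerate(boxes):
        for key in cells_for(box, cell):
            grid[key].append(i)
    return grid


def intersects(a, b):
    return (a.min_x <= b.max_x and a.max_x >= b.min_x
            and a.min_y <= b.max_y and a.max_y >= b.min_y)


def visible(boxes, grid, cell, viewport, pixels_per_unit, min_pixels=1.0):
    """Return sorted indices of boxes that are on screen and big enough to draw."""
    candidates = set()
    # Only visit chunks the viewport touches instead of scanning every box.
    for key in cells_for(viewport, cell):
        candidates.update(grid.get(key, ()))
    result = []
    for i in sorted(candidates):
        box = boxes[i]
        if not intersects(box, viewport):
            continue  # offscreen cull
        size_px = max(box.max_x - box.min_x, box.max_y - box.min_y) * pixels_per_unit
        if size_px < min_pixels:
            continue  # tiny-box cull
        result.append(i)
    return result
```

A flat grid is the simplest thing that works for a sketch like this. With very uneven node density, a quadtree or similar hierarchy would likely chunk better, and the per-cell lists are also a natural unit to store contiguously if the data has to be streamed off disk.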
|
You can probably do the hierarchical clustering with HDBSCAN in reasonable time; it's a fast algorithm.
To have any sort of 2D display you need to project the nodes, which might require some form of PCA given the data set size. UMAP might also work.
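For the projection step, a rough sketch of what "some form of PCA" boils down to: center the data, build the covariance matrix, and project onto its two strongest directions. This version uses plain power iteration in pure Python so it stays self-contained. The function names are made up for illustration, and for a data set of any real size you would reach for a library implementation rather than this.

```python
import math
import random


def _unit_orthogonal(v, previous):
    """Remove components along earlier axes, then scale to unit length."""
    for p in previous:
        dot = sum(a * b for a, b in zip(v, p))
        v = [a - dot * b for a, b in zip(v, p)]
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v] if norm > 1e-12 else None


def dominant_axis(cov, previous, iterations=200, seed=0):
    """Power iteration for the strongest direction orthogonal to `previous`."""
    rng = random.Random(seed)
    v = _unit_orthogonal([rng.random() + 0.1 for _ in cov], previous)
    for _ in range(iterations):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov]
        w = _unit_orthogonal(w, previous)
        if w is None:  # no variance left in this direction
            break
        v = w
    return v


def pca_2d(rows):
    """Project rows (lists of floats) onto their first two principal axes."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    axes = []
    for _ in range(2):
        axes.append(dominant_axis(cov, axes))
    return [[sum(a * x for a, x in zip(axis, r)) for axis in axes] for r in centered]
```

The output is one (x, y) pair per node, which is the form a spatial index or a 2D renderer needs. UMAP would slot into the same place in the pipeline, just with a nonlinear projection instead of a linear one.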
From there, you can use an R*-tree in conjunction with "cut-depth" cluster segmentation tied to zoom level, with additional entity selection based on count and centrality. If you load it into Postgres, PostGIS can do this in one query.
All pretty straightforward stuff.
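As a rough illustration of the "cut-depth" idea, assuming the hierarchical clustering has already been turned into a tree: cut the tree at a depth derived from the zoom level and keep only clusters with enough members. The Cluster type, the levels_per_depth mapping, and the min_count filter are hypothetical, and centrality-based selection is left out of this sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cluster:
    name: str
    count: int  # number of nodes in this cluster
    children: List["Cluster"] = field(default_factory=list)


def cut_at_depth(root, depth):
    """Return the clusters you get by cutting the tree `depth` levels down.

    Leaves that sit above the cut are kept as they are.
    """
    if depth <= 0 or not root.children:
        return [root]
    out = []
    for child in root.children:
        out.extend(cut_at_depth(child, depth - 1))
    return out


def clusters_for_zoom(root, zoom, levels_per_depth=2, min_count=1):
    """Map a zoom level to a cut depth, then drop clusters that are too small."""
    depth = zoom // levels_per_depth
    return [c for c in cut_at_depth(root, depth) if c.count >= min_count]
```

Zooming in moves the cut deeper, so big clusters split into their children. The spatial index then only has to answer which of the surviving clusters fall inside the current viewport.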