| We mostly use it via pygraphistry, and the demo folders have a bunch of examples close to what we do in the wild: https://github.com/graphistry/pygraphistry

Ex:

```
import graphistry
graphistry.nodes(alerts_df).umap().plot()
```

That's smart library sugar for:

```
import graphistry

# alerts_df: your dataframe of alerts; cfg: your featurization options
g = graphistry.nodes(alerts_df)
g2 = g.featurize(*cfg)
# print('encoded', g2._node_features.shape)
g3 = g2.umap()
# print('similarity graph', g3._nodes.shape, g3._edges.shape)
url = g3.plot(render=False)
print(f'<iframe src={url}/>')
```

When automatic CPU/GPU feature engineering happens across heterogeneous dataframe columns, that's pygraphistry's automation calling our lower-level library cu_cat: https://github.com/graphistry/cu-cat

We've been meaning to write about cu_cat with the Nvidia RAPIDS team; it's a cool GPU fork of dirty_cat. We see anywhere from 2-100X speedups going CPU -> GPU. It already has sentence_transformers built in.

Because of our work on louie.ai <> various vector DBs, we're looking at revisiting how to make it even easier to plug in outside embeddings. Would be curious which patterns would be useful there. Prior to this thread, we weren't even thinking folks would want images built in, as we find that so context-dependent... |
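For intuition on what a "similarity graph" over embeddings looks like (the kind of thing the `umap()` step above hands back as nodes plus edges), here's a rough stdlib-only stand-in: a plain cosine k-nearest-neighbor graph. To be clear, this is not pygraphistry's or UMAP's actual algorithm, just a sketch of the idea. `cosine`, `knn_edges`, and the toy alert vectors are all made up for illustration; in practice the vectors would come from whatever embedding source you plug in.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def knn_edges(embeddings, k=2):
    """Hypothetical helper: link each item to its k most similar neighbors.

    embeddings: dict mapping node id -> embedding vector (list of floats).
    Returns a list of {'src', 'dst', 'weight'} edge records.
    """
    edges = []
    for src, vec in embeddings.items():
        scored = [
            (cosine(vec, other), dst)
            for dst, other in embeddings.items()
            if dst != src
        ]
        # Highest similarity first; keep only the top k neighbors
        scored.sort(key=lambda pair: pair[0], reverse=True)
        for sim, dst in scored[:k]:
            edges.append({'src': src, 'dst': dst, 'weight': sim})
    return edges

# Toy embeddings (invented values): a/b point one way, c/d another
toy = {
    'alert_a': [1.0, 0.0, 0.1],
    'alert_b': [0.9, 0.1, 0.0],
    'alert_c': [0.0, 1.0, 0.2],
    'alert_d': [0.1, 0.9, 0.1],
}

edges = knn_edges(toy, k=1)
for e in edges:
    print(e['src'], '->', e['dst'], round(e['weight'], 3))
```

An edge list shaped like this is what you would load into a dataframe and bind as a graph's edges (e.g. `graphistry.edges(edges_df, 'src', 'dst')`); the real pipeline does the neighbor search and weighting for you and at much larger scale.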