Hacker News new | ask | show | jobs
by liorben-david 1155 days ago
The paper seems to be using homology as the topological feature here. I've done some work in Topological Data Analysis before and it feels like the hidden issue is that computing homology is generally very inefficient(Since it usually amounts to reducing an nxn matrix).

It definitely feels like graphs/topology should be helpful tools to work with data(Since graph-like structures are good representations of the real world), but we need to solve this efficiency issue before this can be possible.

Also to address the confusion on how category theory comes into it, category theory studies abstract structures where you have objects and relationships between these objects. A lot of algebraic topology(Which is the sort of topology relevant here) is built in the language of category theory(Either by neccesity or by convention).

1 comments

I think most people in the field recognize this efficiency issue and focus on very small NNs. As of yet I have seen very little promising work on improving efficiency, other than work outside of NNs, e.g. Eirene for PH (which is actually a mess under the hood - someone needs to clean up the code base last time I checked - 9 deep nested for loops, really?). There are several options for topological NN layers out there now, but all of them are excruciatingly slow and blow up RAM (CPU run NNs, since typically >500 GB for fairly small problems) very quickly when I have tried them out.
Completely Agree. Part of the issue feels like that the people working on this are far more on the Math side than on the CS side. Like Gunnar Carlson's textbook and papers about Persistent Homology are geared towards algebraic topologists when there is probably a way to explain Persistent Homology that does not rely on category theory, polynomial rings, etc
For suitable specialized cases thins can be quite efficient. For persistent H_0 of VR-complexes in low-dimensional space there is an O(N log(N)) algorithm for N data points; that's decently fast. If you want H_1 I believe (but cannot prove) that there exists an O(N^2 log(N)) algorithm. Beyond H_1 things get painful. Since most PH software is written for general cases they don't tend to avail themselves of these special case shortcuts. Given that H_0 and H_1 of VR-complexes of low dimensional data covers a vast amount of the use cases I think specialized code for this would be worthwhile.