by tech_ken
820 days ago
How are you defining the utility of a metric here? It’s not clear to me why a locally varying metric would necessarily be more 'useful' than a global one in the context of the manifold hypothesis. Moreover, if I’m understanding their argument right, then W’W is proportional to an average of the exterior derivative of the manifold representing the prediction surface of any given NN layer (averaging with respect to the measure defined by the data-generating process). While this averaging by definition leaves some of the local information on the cutting room floor, the result is going to be far more interpretable (because we've discarded all that distracting local data), and I would assume it will still retain the large-scale structure of the underlying manifold (outside of some gross edge cases).