by Gladdyu 2736 days ago
How would this handle systems in which the dimensionality of the hidden layers differs from the input dimension, as is often the case? To solve the ODE you'd need at least one sample point, which might restrict how much information such a network can capture.

The paper is explicit that it applies only to sequences of ResNet layers, which happen to have equal input and output dimensionality.

It’s a property of the functional form of stacked ResNet layers that allows the ODE layer to be used instead.
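To make that concrete, here is a minimal sketch (not the paper's code) of why a stack of residual layers looks like an ODE solve: each block computes h + f(h), which is one forward Euler step of dh/dt = f(h) with step size 1. The dynamics function `f` and the names `resnet_stack` and `euler_integrate` are made up for illustration; a real network would have learned, per-layer weights.

```python
import math

def f(h, t):
    """Hypothetical layer dynamics: elementwise tanh, so the output has the same dimension as h."""
    return [math.tanh(x) for x in h]

def resnet_stack(h, n_layers):
    """Apply n_layers residual blocks, each computing h <- h + f(h)."""
    for k in range(n_layers):
        h = [x + dx for x, dx in zip(h, f(h, k))]
    return h

def euler_integrate(h, t0, t1, n_steps):
    """Forward Euler for dh/dt = f(h, t), integrated from t0 to t1."""
    dt = (t1 - t0) / n_steps
    t = t0
    for _ in range(n_steps):
        h = [x + dt * dx for x, dx in zip(h, f(h, t))]
        t += dt
    return h

h0 = [0.5, -1.0, 2.0]
out_resnet = resnet_stack(h0, 4)
# Four Euler steps over [0, 4] means dt = 1, which is exactly the residual update.
out_euler = euler_integrate(h0, 0.0, 4.0, 4)
```

Because the update adds f(h) to h, f has to return something of the same dimension as h, which is where the equal-dimension restriction comes from.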

You are right that many networks will require at least some (and usually many) layers that change dimensionality. So ODE layers are not going to wholesale replace anything, apart from possibly submodules of a bigger network where the submodule is made up just of sequences of ResNet layers.

ODE layers will also have new applications, such as for irregularly spaced sequential inputs or outputs.

I imagine that if you have one discrete layer from input to hidden, you could then treat the hidden-layer dynamics as continuous from there until the next change of dimension. You should still be able to backpropagate through a continuous-discrete or discrete-continuous boundary. There should be a lot of literature on this in control systems simulation.

The other thing you can do is to "pad" the input up to be the size of the hidden layers.
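A minimal sketch of the padding idea, assuming zero-padding is what's meant (the helper name `pad_input` and the choice of zeros are my assumptions, not something from the paper):

```python
def pad_input(x, hidden_dim):
    """Zero-pad the input vector x up to hidden_dim entries.

    Assumption: the extra coordinates start at 0.0 and are then free to
    evolve under the continuous dynamics along with the original ones.
    """
    if len(x) > hidden_dim:
        raise ValueError("input is already larger than the hidden dimension")
    return list(x) + [0.0] * (hidden_dim - len(x))

x = [1.0, 2.0]
h0 = pad_input(x, 5)  # a 5-dimensional initial state for the ODE
```

This only covers the case where the hidden state is wider than the input; shrinking the dimension would still need a discrete layer.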

> How would this handle systems in which the dimensionality of the hidden layers is not equal to the input dimension

You could use the ODE solver to integrate the system from t1 to t2a. Then you need one normal neural net layer (one forward Euler time step) that takes the system from t2a to t2b and changes dimensions. Then you can again use the ODE solver to integrate from t2b to t3.
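Roughly, that pipeline could look like the sketch below. Everything here is illustrative: the fixed-step Euler loop stands in for a real ODE solver, `dynamics` is a toy decay instead of a learned network, and the weights `W` and `b` are hand-picked. The same `dynamics` function is reused on both sides only because it is elementwise; in practice each continuous segment would have its own dynamics.

```python
def euler_integrate(f, h, t0, t1, n_steps=100):
    """Stand-in for an ODE solver: fixed-step forward Euler from t0 to t1."""
    dt = (t1 - t0) / n_steps
    t = t0
    for _ in range(n_steps):
        h = [x + dt * dx for x, dx in zip(h, f(h, t))]
        t += dt
    return h

def dynamics(h, t):
    """Toy dynamics dh/dt = -h; works for any state dimension."""
    return [-x for x in h]

def dense(W, b, h):
    """Ordinary discrete layer W @ h + b; this is where the dimension changes."""
    return [sum(w * x for w, x in zip(row, h)) + bi for row, bi in zip(W, b)]

# Hypothetical weights mapping a 3-dimensional state to a 2-dimensional one.
W = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 0.0]]
b = [0.0, 0.0]

h_start = [1.0, 2.0, 3.0]
h_mid = euler_integrate(dynamics, h_start, 0.0, 1.0)  # continuous segment in 3 dimensions
h_jump = dense(W, b, h_mid)                           # discrete jump from 3 to 2 dimensions
h_out = euler_integrate(dynamics, h_jump, 1.0, 2.0)   # continuous segment in 2 dimensions
```

Gradients would flow through the `dense` step by ordinary backpropagation and through each continuous segment by whatever method the solver supports.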