Hacker News
by teddykoker 1422 days ago
Link to paper [1]. It looks like the authors construct a basic autoencoder to predict frames in videos of various physical systems (double pendulum, lava lamp, etc.) and then use the Levina–Bickel algorithm [2] to estimate the "intrinsic dimension" of the autoencoder's latent space. They then refer to the intrinsic dimension as the "minimum number of variables required by the system to accurately capture the motion", e.g. 24 for a video of a fireplace.
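
For anyone who wants to see what the estimator in [2] does, here is a minimal NumPy sketch of the Levina–Bickel maximum-likelihood estimate for a fixed neighbour count k. This is my own illustration, not the paper's code: the function name `levina_bickel_dimension`, the choice k=10, and the toy data (a 2-D surface embedded non-linearly in 5-D, standing in for latent vectors) are all assumptions.

```python
import numpy as np


def levina_bickel_dimension(points, k=10):
    """Levina-Bickel MLE of intrinsic dimension, averaged over all points.

    For each point, T_j is the distance to its j-th nearest neighbour and
    the local estimate is (k - 1) / sum_{j=1}^{k-1} log(T_k / T_j).
    """
    points = np.asarray(points, dtype=float)
    # Brute-force pairwise Euclidean distances, shape (n, n).
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Sort each row; column 0 is the point itself (distance 0), so skip it.
    knn = np.sort(dists, axis=1)[:, 1:k + 1]  # columns are T_1 .. T_k
    # Log ratios T_k / T_j for j = 1 .. k-1.
    log_ratios = np.log(knn[:, -1:] / knn[:, :-1])
    local_estimates = (k - 1) / log_ratios.sum(axis=1)
    return local_estimates.mean()


# Toy stand-in for latent vectors: two underlying variables (t, s),
# embedded in five dimensions through a smooth non-linear map.
rng = np.random.default_rng(0)
t, s = rng.uniform(size=(2, 600))
data = np.column_stack([t, s, np.sin(t), np.cos(s), t * s])

print(levina_bickel_dimension(data, k=10))
```

On data like this the estimate should land near 2 even though each point has five coordinates. It is a statistical estimate that depends on k and the sample size, so a figure like 24 should probably be read as approximate.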

Personally, I wonder how much information this actually provides about a system. Since the neural network is non-linear, a single latent variable may theoretically function as more than one state variable.
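
To make that concern concrete, here is a deliberately artificial sketch of one number carrying two state variables. It is not something the paper does; `pack`, `unpack`, and the `levels` parameter are hypothetical names of my own, and the decoder is discontinuous, which a smooth network could at best approximate.

```python
def pack(x, y, levels=1000):
    """Pack two values in [0, 1) into a single float in [0, 1).

    Each value is quantized to `levels` steps, so some precision is lost.
    """
    qx = int(x * levels)
    qy = int(y * levels)
    return (qx * levels + qy) / (levels * levels)


def unpack(z, levels=1000):
    """Recover the two quantized values from the packed float."""
    code = round(z * levels * levels)
    qx, qy = divmod(code, levels)
    return qx / levels, qy / levels


z = pack(0.25, 0.75)
print(z, unpack(z))
```

The single value `z` decodes back to both inputs up to the quantization step, which is why a count of latent variables does not automatically equal a count of independent physical quantities.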

[1]: https://www.nature.com/articles/s43588-022-00281-6.epdf?shar...

[2]: https://www.stat.berkeley.edu/~bickel/mldim.pdf

1 comment

Chess experts are only good at remembering valid chess positions that could result from a game in play. Their "expertise" vanishes when they try to remember boards set up in random (invalid) positions.

It could be that each of these systems has only a few valid states that can be compactly encoded, and the autoencoders found them.

This could be highly useful.