by felippee 3466 days ago
It's rather the opposite. It is unsupervised initially: it just learns to predict its input. Note there is a possible confusion here: each unit itself is trained in a supervised way (the future signal is its target), but overall nothing needs to be labeled, because reality just unfolds. Then, once this is done, one can use the trained features (representations) to train supervised tasks, such as the street sign tracking.
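
To make the "supervised by the future signal" point concrete, here is a minimal sketch. It is not the actual model discussed in the post, just a toy linear predictor on a sine wave. The function name, the window size, the learning rate and the signal are all assumptions made for illustration. The only "label" the predictor ever sees is the next sample of its own input.

```python
import math

def train_next_step_predictor(signal, window=2, lr=0.05, epochs=50):
    """Fit a linear model that predicts signal[t] from the previous `window` samples.

    Toy illustration only: the names and hyperparameters are hypothetical.
    """
    weights = [0.0] * window
    history = []  # mean squared prediction error, one entry per epoch
    for _ in range(epochs):
        total = 0.0
        count = 0
        for t in range(window, len(signal)):
            past = signal[t - window:t]
            target = signal[t]  # the "label" is simply the next input sample
            pred = sum(w * x for w, x in zip(weights, past))
            err = pred - target
            # One gradient step on the squared prediction error (LMS rule).
            for i in range(window):
                weights[i] -= lr * err * past[i]
            total += err * err
            count += 1
        history.append(total / count)
    return weights, history

# No hand-made labels anywhere: the signal supervises itself as it unfolds.
signal = [math.sin(0.1 * t) for t in range(500)]
weights, history = train_next_step_predictor(signal)
print("first-epoch error:", history[0])
print("last-epoch error:", history[-1])
```

In the same spirit, the learned weights (or, in a real network, the hidden activations) would be what gets reused as features for a later labeled task.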

PS: yes, there is a strong literature suggesting that the brain is predicting a bunch of things. Check this long review paper http://www.fil.ion.ucl.ac.uk/~karl/Whatever%20next.pdf for plenty of ideas and details.

1 comments

Oh, wow. So really this allows prediction of a complex input, in a general sense?

If I understand correctly, in the context of that visual example, if it were trained with a moving camera and a static scene, then it would only be able to predict scene transformations caused by that moving camera. Maybe this explains how the tracking somewhat fails when the ball is moving along the grass towards the end of the scene. It doesn't "know" much about moving objects, just moving cameras? So then training with moving objects would let it predict those as well?

In what the video is showing, if it can predict perspective transforms from camera movement, like it seems to be doing, does that mean it's building something like a 3D model, or something like a depth map used for its motion prediction, somewhere in there?

I would love to see the error video for some sort of rotating 3D wireframe that it was trained on.

This whole approach of a general "predictor" seems incredible.