by bcheung
2996 days ago
An encoder/decoder architecture learns a more "efficient" representation. It tries to find features that are useful for describing the variations in the input data (images) that it has seen. For example, if trained on faces, it will learn features for things like eyes and mouths. So the image can be encoded as "put a mouth of this type, with this width, at this location" rather than operating at the level of pixels. If trained on text, it might learn features related to letters and typography (boldness, italics, size, spacing). So it might encode things as "Helvetica, 16pt, italics". This is a gross oversimplification, and the learned features rarely map exactly to concepts humans would use, but hopefully it communicates the idea.
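To make the idea concrete, here is a minimal sketch of an encoder/decoder in plain NumPy. It is an illustration only and rests on several assumptions: the model is a tiny linear autoencoder rather than the deep, nonlinear networks used on real images, the data is synthetic (8 numbers per sample, secretly generated from 2 hidden factors), and every name in it (`train_autoencoder`, `W_enc`, `W_dec`, `code_size`) is made up for this example.

```python
import numpy as np

def train_autoencoder(X, code_size=2, steps=2000, lr=0.05, seed=0):
    """Fit a linear encoder/decoder pair by gradient descent on squared error."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    # Hypothetical parameter names: the encoder squeezes each input down to
    # `code_size` numbers, the decoder expands the code back to input size.
    W_enc = rng.normal(scale=0.1, size=(n_features, code_size))
    W_dec = rng.normal(scale=0.1, size=(code_size, n_features))
    losses = []
    for _ in range(steps):
        code = X @ W_enc             # encode: compact description of each sample
        recon = code @ W_dec         # decode: rebuild the input from the code
        err = recon - X
        losses.append(float(np.mean(err ** 2)))
        # Gradients of the mean squared reconstruction error.
        grad_recon = 2.0 * err / err.size
        grad_dec = code.T @ grad_recon
        grad_enc = X.T @ (grad_recon @ W_dec.T)
        W_dec -= lr * grad_dec
        W_enc -= lr * grad_enc
    return W_enc, W_dec, losses

# Synthetic data (an assumption for this sketch): 200 samples of 8 values,
# all produced by mixing just 2 underlying factors.
rng = np.random.default_rng(42)
factors = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 8))
X = factors @ mixing
X = X / X.std()                      # rescale so the step size is well behaved

W_enc, W_dec, losses = train_autoencoder(X)
code = X @ W_enc                     # each sample is now described by 2 numbers
recon = code @ W_dec
print("first loss:", losses[0], "last loss:", losses[-1])
```

The two numbers in `code` play the role of "mouth type, width, location" in the description above: a compact set of features from which the decoder can rebuild the input. As with the real thing, the learned features are whatever happens to reduce reconstruction error, and need not line up with the factors that generated the data.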