|
|
|
|
|
by timeattack
498 days ago
|
|
That part I understand and it is quite easy to imagine, but that mental model means that novel data, not present in dataset in a semantical sense, can not be mapped to any exact point in that latent space except to just random one, because quite literally this point does not exist in that space, so no clever statistical sampling would be able to produce it from other points.
Surely, we can include hex-encoded knowledge base into dataset, increase dimensionality, then include double-hex encoding and so on, but it would be enough to do (n+1) hex encoding and model would fail. Sorry that I repeat that hex-encoding example, you can substitute it with any other example. However, it seems that our minds do not have any built-in limit on indirection (rather than time & space). |
|
This is your error, afaik.
The idea of the architecture design / training data is to produce a space that spans the entirety of possible input, regardless of whether it was or wasn't in the training data.
Or to put it another way, it should be possible to infer a lot of things about cats, trained on the entirety of human knowledge, even if you leave out every definition of cats.
See other comments about pre-decoding though, as expect there are some translation-like layers, especially for hardcodable transforms (e.g. common, standard encodings).