by Imnimo
1934 days ago
Well, the text encoder sees the ASCII characters s-p-i-d-e-r (after byte-pair encoding). That's different from seeing a photograph of a piece of paper that says "spider" on it. What's surprising is not that the network can associate a picture of Spider-Man with a caption that contains the text "spider", but that the same neuron lights up when you show it a piece of paper that says "spider" as when you show it a picture of Spider-Man.