by chromanoid
1935 days ago
I am not sure if I get it, but isn't the CLIP "multi-modal" neuron only considered "multi-modal" because it sits a few layers before the actual output? Maybe the indirection is just obfuscation and not a sign of abstraction.