by iknownothow
735 days ago
But don't input embeddings need to undergo backprop during training? Won't the external model's embeddings just be noise, since they don't share an embedding space with the model being trained? If the external model is also trained alongside it, then I think that might work.
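To make the distinction concrete, here is a rough toy sketch in plain Python of the two setups I mean: embeddings kept frozen versus embeddings updated by the same gradient step as the downstream weights. Everything here (the `train` function, the two-token vocabulary, the squared-error loss) is made up for illustration and is not taken from any particular model or library.

```python
import random

def train(embeddings, weights, data, train_embeddings, lr=0.05, steps=50):
    """Toy SGD on loss = (w . e[token] - target)^2. Returns updated copies."""
    emb = {tok: list(vec) for tok, vec in embeddings.items()}  # copy, never mutate the input
    w = list(weights)
    for _ in range(steps):
        for tok, target in data:
            e = emb[tok]
            pred = sum(wi * ei for wi, ei in zip(w, e))
            err = 2.0 * (pred - target)        # dLoss/dpred
            grad_w = [err * ei for ei in e]    # gradient w.r.t. downstream weights
            grad_e = [err * wi for wi in w]    # gradient w.r.t. the input embedding
            w = [wi - lr * g for wi, g in zip(w, grad_w)]
            if train_embeddings:
                # Joint training: the embedding moves with the model.
                emb[tok] = [ei - lr * g for ei, g in zip(e, grad_e)]
    return emb, w

random.seed(0)
# Hypothetical "external" embeddings: random vectors the downstream model never shaped.
external = {t: [random.uniform(-1, 1) for _ in range(2)] for t in ("a", "b")}
w0 = [random.uniform(-1, 1) for _ in range(2)]
data = [("a", 1.0), ("b", -1.0)]

frozen_emb, frozen_w = train(external, w0, data, train_embeddings=False)
joint_emb, joint_w = train(external, w0, data, train_embeddings=True)

print("frozen embeddings unchanged:", frozen_emb == external)
print("joint embeddings unchanged:", joint_emb == external)
```

In the frozen case only the downstream weights can adapt to whatever the external vectors happen to be; in the joint case the embedding table itself is pulled toward something the model can use, which is the scenario I suspect is needed.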