by hh1
767 days ago
When the paper talks about "c" or "scalar memory", does that refer to a single unit of the vector usually called c? So in mLSTM, each unit of that vector is now a matrix (making the whole state a 3D tensor)? And each matrix is what you refer to as a head? I'm having a bit of trouble understanding this fundamental part.
For the matrix state 'C', there are also heads/cells, in the sense that you have several of them, but they don't talk to each other. So yes, you can view the full state as a 3D tensor. Here, the matrix is the fundamental building block / concept.
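
To make the shapes concrete, here is a rough NumPy sketch. It is not the paper's code: the sizes `num_heads` and `head_dim` are made up, the gate values are arbitrary constants, and the update is a simplified form of the matrix-memory rule (decay the old state, add an outer product of value and key) that leaves out the normalizer state and gate stabilization.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
num_heads, head_dim = 4, 8

# Scalar memory: one scalar per cell, so the whole state is a vector.
c = np.zeros(num_heads * head_dim)

# Matrix memory: one (head_dim x head_dim) matrix per head,
# so the whole state is a 3D tensor.
C = np.zeros((num_heads, head_dim, head_dim))

rng = np.random.default_rng(0)
k = rng.standard_normal((num_heads, head_dim))  # one key vector per head
v = rng.standard_normal((num_heads, head_dim))  # one value vector per head

# Assumed here: one scalar input gate and one scalar forget gate per head.
i_gate = np.full(num_heads, 0.5)
f_gate = np.full(num_heads, 0.9)

# Simplified per-head update: C_h <- f_h * C_h + i_h * (v_h k_h^T).
# The einsum builds one outer product per head; nothing mixes across heads.
outer = np.einsum("hi,hj->hij", v, k)
C_new = f_gate[:, None, None] * C + i_gate[:, None, None] * outer

print(c.shape)      # vector of scalar cells
print(C_new.shape)  # stack of per-head matrices
```

Each slice `C_new[h]` depends only on head `h`'s own key, value, and gates, which is the "they don't talk to each other" part.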